Project Description:
This project builds an ultra-low-power, event-driven hardware accelerator for real-time speech processing. We will start from the open-source Neuromorphic Auditory Sensor (NAS) feature extractor and integrate it with the EdgeDRNN accelerator to perform keyword spotting (KWS) on the Google Speech Commands Dataset. The student will first implement the EdgeDRNN classifier on the NAS FPGA platform (equipped with PDM microphones and an Artix-7 A200 FPGA) to process spike-based audio features in real time. Following a successful FPGA prototyping phase, the student will extend the digital design into an ultra-low-power ASIC architecture.
The student has considerable freedom to co-design the machine learning model and the hardware, and will be encouraged to innovate on the NAS feature-extraction architecture (e.g., spike-based cascaded filter banks), the neural network classifier (e.g., DeltaGRU or spiking neural networks), model compression methods (quantization, pruning, and temporal sparsity via delta-network thresholds), and the neuromorphic accelerator data path. We will target low end-to-end latency, high classification accuracy, and extreme energy efficiency (performance per watt), with power budgets down to the microwatt range. We highly welcome students motivated to publish their findings in top-tier solid-state circuit conferences/journals (e.g., ISSCC, VLSI, JSSC).
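To make the temporal-sparsity idea concrete, below is a minimal PyTorch sketch of a delta-network update (toy sizes and a hypothetical threshold theta; not the EdgeDRNN implementation): an input element is only propagated when it changes by more than the threshold, so the matrix-vector work scales with the number of changed elements rather than the full input dimension.

```python
# Minimal sketch of the delta-network idea (hypothetical shapes/names):
# an input is only propagated when it changes by more than a threshold,
# so matrix-vector work scales with the number of "active" elements.
import torch

def delta_encode(x_t, x_ref, theta=0.1):
    """Return the sparse delta and the updated reference state."""
    delta = x_t - x_ref
    mask = delta.abs() >= theta                 # elements that changed enough
    delta = torch.where(mask, delta, torch.zeros_like(delta))
    x_ref = torch.where(mask, x_t, x_ref)       # update reference only where fired
    return delta, x_ref, mask

# Toy usage: accumulate W @ x incrementally from thresholded deltas.
torch.manual_seed(0)
W = torch.randn(16, 8)
x_ref = torch.zeros(8)
acc = torch.zeros(16)                           # running estimate of W @ x_ref
for t in range(5):
    x_t = x_ref + torch.randn(8) * 0.2          # slowly varying input
    delta, x_ref, mask = delta_encode(x_t, x_ref, theta=0.1)
    acc += W @ delta                            # only deltas above theta contribute
    print(f"t={t}: {int(mask.sum())}/8 inputs active")
```

In DeltaGRU, the same thresholding is applied to the recurrent hidden state as well as the input, which is the sparsity the accelerator data path exploits.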
Ethics & consent: All standard experiments will use the publicly available Google Speech Commands Dataset. Any live audio testing or custom dataset recording for real-time hardware evaluation must use voices with explicit written consent and adhere to privacy-preserving edge AI principles.
Preferred Skills:
Digital IC/FPGA design (Verilog/SystemVerilog), toolflows (Xilinx Vivado, Cadence/Synopsys ASIC design flows)
Python & PyTorch
Deliverables:
A novel dynamically sparse neural network classifier optimized for NAS spike-based auditory features.
Fully functional FPGA accelerator prototype on the NAS FPGA platform demonstrating real-time end-to-end KWS inference.
Extended ultra-low-power digital ASIC design (RTL-to-GDS).
A high-quality manuscript prepared for submission to a top-tier solid-state circuits venue.
Reference Material:
Neuromorphic Auditory Sensor (NAS) Slides (Link)
EdgeDRNN: https://arxiv.org/abs/2012.13600
ASIC Baseline: A 23µW Solar-Powered Keyword-Spotting ASIC with Ring-Oscillator-Based Time-Domain Feature Extraction (ISSCC 2022): https://ieeexplore.ieee.org/document/9731708
Contact Person: Dr. Chang Gao (Chang.Gao@tudelft.nl)
The student will be supervised by Dr. Chang Gao (TU Delft) and co-supervised by Prof. Antonio Rios-Navarro (Univ. of Sevilla, Spain), Prof. Juan P. Dominguez-Morales (Univ. of Sevilla, Spain), and Dr. Juan Manuel Montes-Sánchez.
Project Description:
This project builds an energy-efficient FPGA-based accelerator for real-time voice cloning. We will start from the open-source Real-Time-Voice-Cloning pipeline (GitHub; YouTube demo) and replace the backbone TTS model with EfficientSpeech (paper) to reduce compute and memory costs while preserving naturalness. The student will co-design the model and hardware: apply quantization, pruning, and sparsity/weight-sharing to the encoder/decoder blocks, then map the key kernels (mel-spectrogram, attention/FFN, vocoder) to an FPGA data path. We will prototype on the Avnet MiniZed board, targeting low end-to-end latency, intelligibility (WER), MOS-style quality metrics, and performance per watt.
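As a rough illustration of the compression step, the sketch below applies magnitude pruning followed by dynamic INT8 quantization to a placeholder Linear-heavy block in PyTorch; the module, sizes, and 50% pruning ratio are stand-ins, not the actual EfficientSpeech layers.

```python
# Minimal compression sketch on a placeholder block (not EfficientSpeech):
# magnitude pruning, then dynamic INT8 quantization of the Linear layers.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

block = nn.Sequential(nn.Linear(256, 1024), nn.ReLU(), nn.Linear(1024, 256))

# Prune 50% of the smallest-magnitude weights in each Linear layer.
for m in block.modules():
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.5)
        prune.remove(m, "weight")               # bake the mask into the tensor

# Dynamic quantization: weights stored as INT8, activations quantized at runtime.
qblock = torch.quantization.quantize_dynamic(block, {nn.Linear}, dtype=torch.qint8)
x = torch.randn(1, 256)
print(qblock(x).shape)                          # torch.Size([1, 256])
```

On an FPGA the pruned, quantized weights would be stored in a compressed sparse format so the data path skips zero weights entirely, rather than relying on PyTorch's runtime.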
Ethics & consent: All cloning experiments must use voices with explicit written consent, and all generated audio must include a “cloned audio” watermark.
Preferred Skills:
FPGA design (Verilog/SystemVerilog), toolflows (Vivado/Vitis)
Python & PyTorch
Deliverables:
Compressed EfficientSpeech-based TTS model integrated into the Real-Time-Voice-Cloning pipeline
FPGA accelerator on MiniZed with real-time inference (encoder/decoder + vocoder offload)
Reference Material:
Real-Time-Voice-Cloning: https://github.com/CorentinJ/Real-Time-Voice-Cloning
Demo video: https://www.youtube.com/watch?v=-O_hYhToKoA
EfficientSpeech paper: https://arxiv.org/abs/2305.13905
Avnet MiniZed board: https://www.avnet.com/americas/products/avnet-boards/avnet-board-families/minized
Contact Person: Dr. Chang Gao (Chang.Gao@tudelft.nl)
This project investigates the integration of neural networks into the signal acquisition pipeline to enhance the performance of low-resolution, low-power analog-to-digital converters (ADCs). Traditional ADCs are designed to maximize signal fidelity across a wide range of applications, but this project takes a domain-adaptive approach by pairing simplified ADC front-ends with task-specific neural post-processing to reconstruct or enhance the acquired signals.
The student will work on a prototype system built around a commercial off-the-shelf ADC (e.g., from Texas Instruments) on a PCB, and implement neural enhancement models in PyTorch to process the raw ADC outputs. The project will include hardware-in-the-loop training and inference, with optional deployment on embedded platforms such as Raspberry Pi or FPGAs. Emphasis will be placed on evaluating power, latency, and accuracy trade-offs in an end-to-end pipeline.
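As a purely offline illustration of the idea (no hardware in the loop yet), the sketch below degrades a clean signal with a simulated 4-bit uniform quantizer standing in for the low-resolution ADC front-end and trains a small 1-D CNN in PyTorch to reconstruct it; the signal model, bit depth, and hyperparameters are all illustrative.

```python
# Offline sketch: simulate a low-resolution ADC, then train a small 1-D CNN
# to reconstruct the clean signal from the coarsely quantized samples.
import torch
import torch.nn as nn

def adc(x, bits=4, full_scale=1.0):
    """Simulated uniform quantizer standing in for the real ADC front-end."""
    step = 2 * full_scale / (2 ** bits)
    return torch.clamp(torch.round(x / step) * step, -full_scale, full_scale)

model = nn.Sequential(
    nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 16, 9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, 9, padding=4),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step_i in range(200):
    t = torch.rand(32, 1, 1) * 6.28                       # random phases
    phase = torch.linspace(0, 6.28, 256).reshape(1, 1, -1)
    clean = 0.8 * torch.sin(phase + t)                    # toy "signals"
    coarse = adc(clean, bits=4)                           # 4-bit acquisition
    loss = nn.functional.mse_loss(model(coarse), clean)
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final reconstruction MSE: {loss.item():.5f}")
```

In the hardware-in-the-loop version, the `adc` stand-in is replaced by captures from the commercial ADC on the PCB, so the model learns the device's actual non-idealities rather than an idealized quantizer.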
Preferred Skills: PyTorch, Signal Processing Fundamentals, Embedded Systems, PCB Prototyping, Verilog (optional for hardware acceleration)
Contact Person: Dr. Chang Gao (Chang.Gao@tudelft.nl)
This project aims to develop an AI-driven calibration system for phase-locked loop (PLL) non-idealities such as jitter, phase noise, and locking instability by adapting the OpenDPD framework, traditionally used for power-amplifier linearization. Leveraging MATLAB for system-level PLL modeling and PyTorch for training neural networks, the project will automate the identification and compensation of non-linear behaviors in mixed-signal circuits. Deliverable: an open-source simulation package enabling AI-calibrated PLLs for high-precision communication systems.
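The sketch below illustrates the DPD-style calibration principle on a toy behavioral model (the non-linearity, network, and training setup are illustrative, not the OpenDPD API): a small network learns the deterministic component of the phase error so it can be predicted and subtracted, leaving only the random jitter floor.

```python
# Toy DPD-style calibration sketch (illustrative, not the OpenDPD API):
# learn the deterministic part of a phase non-linearity so it can be
# pre-subtracted, analogous to linearizing a power amplifier.
import torch
import torch.nn as nn

def pll_phase_error(code):
    """Toy behavioral model: static non-linearity plus random jitter."""
    return 0.3 * torch.tanh(2.0 * code) + 0.05 * code**3 \
        + 0.01 * torch.randn_like(code)

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for _ in range(2000):
    code = torch.rand(256, 1) * 2 - 1           # normalized control codes
    err = pll_phase_error(code)                 # "measured" phase error
    loss = nn.functional.mse_loss(net(code), err)
    opt.zero_grad(); loss.backward(); opt.step()

# Calibration: subtract the predicted deterministic error; residual ~ jitter.
code = torch.linspace(-1, 1, 5).unsqueeze(1)
print((pll_phase_error(code) - net(code)).squeeze())
```

In the actual project, the toy model would be replaced by the MATLAB/Simulink PLL model, and the trained network's prediction would drive the compensation path.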
Preferred Skills: MATLAB/Simulink, PyTorch, RF system fundamentals.
Contact Person: Dr. Masoud Babaie (M.Babaie@tudelft.nl) and Dr. Chang Gao (Chang.Gao@tudelft.nl)
Project Description:
This project leverages AI-driven automation to co-design neural networks and their hardware accelerators for real-time edge applications. The goal is to integrate software optimization techniques such as quantization and pruning with hardware-aware neural network training to develop energy-efficient accelerators on FPGA and ASIC platforms. By automating the end-to-end design process with reinforcement learning and AI-driven reasoning, the project aims to bridge the gap between deep learning frameworks and hardware implementation.
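As one concrete ingredient of hardware-aware training, the sketch below fake-quantizes weights with a straight-through estimator (STE) so the network trains under the accelerator's integer precision; the bit-width and model are placeholders, and a real flow would expose such knobs as actions for the automated search.

```python
# Hardware-aware training sketch: fake-quantized weights with a
# straight-through estimator, so training sees the accelerator's precision.
import torch
import torch.nn as nn

class FakeQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w, bits=8):
        scale = w.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
        return torch.round(w / scale) * scale   # quantize-dequantize
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None                   # STE: pass gradient through

class QuantLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, FakeQuant.apply(self.weight, 8), self.bias)

model = nn.Sequential(QuantLinear(64, 32), nn.ReLU(), QuantLinear(32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(128, 64), torch.randint(0, 10, (128,))
for _ in range(100):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
print(f"training loss with 8-bit weights: {loss.item():.3f}")
```

An RL-based design agent could then treat per-layer bit-widths and pruning ratios as its action space, with rewards combining task accuracy and estimated FPGA/ASIC energy cost.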
Preferred Skills:
Deep Learning frameworks (PyTorch, TensorFlow)
Hardware Description Languages (Verilog/SystemVerilog)
FPGA/ASIC design flow (Vivado, OpenLANE, Cadence)
Contact Person: MSc Ang Li (ang.li@tudelft.nl) and Dr. Chang Gao (Chang.Gao@tudelft.nl)