Final Project

I want to combine my background in AI application development with TMG's focus on Tangible Interfaces to build a voice-driven AI programming system inspired by telephone switchboard operators.

The inspiration

I’m motivated by how AI-routed phone systems have eroded the empathy and connection once provided by human operators. I want to revive that craft by building a switchboard that puts a person back in the loop as a thoughtful listener and connector.

Switchboard operator Jersey Telecom Switchboard and Operator (source)

The idea

A physical AI agent network implemented as a hardware grid with voice-based interaction and programming capabilities. The system combines push-to-talk interfaces with node-based generative AI computation, allowing users to dynamically program and interact with AI agents through voice commands. I want to call this system Field Programmable Generative AI (FPGAI).

Concept sketch My initial sketch

Next, I want to visualize the idea with gen AI. I'm entirely new to 3D modeling and rendering, so the fastest route to gain intuition on the form of the design is naturally using AI.

I crafted the prompt based on what I was imagining. The latest Gemini model got this for me in one shot.

Base Device base (prompt)

Next, let's visualize the hand-held device. I want to model it after a CB radio speaker mic, inspired by this project.

Hand unit Hand unit (prompt)

Finally, let's put them together and add some context. I haven't decided the exact size for each component yet. That will have to wait until I have figured out the electronics.

In use In use (prompt)

The implementation

While it's still too early to fully specify the project, I have the following high level design.

Main Board

Speaker-Microphone Units

Operating Modes

Interaction Mode

Programming Mode

After the conceptual exploration from week 1, I switched focus to the electronics. I hope the electronics design can help inform the exterior of the system.

I started off with off-the-shelf components and iterated on the idea to build more from scratch.

Proof of concept with off-the-shelf components

I can prototype almost the entire experience with cheap off-the-shelf products:

  1. Push-to-talk with a secondhand CB radio hand unit
  2. Audio cable adapters to 3.5mm TRRS
  3. USB hub for multiple inputs

Prototype 1 Prototype using consumer electronics

What's missing:

  1. No effort involved. This will result in a failing grade. It's only good for prototyping
  2. Can't guarantee the compatibility of the hand unit with the 3.5mm TRRS jack
  3. Can't prototype the visual feedback feature, where the 3.5mm jack shows "ready" state to the user via an LED

Bring intelligence to the main body

Iterating on the idea, I could use a Raspberry Pi with a primitive USB hub as the main processor. The Pi may still use a nearby laptop for the LLM, speech-to-text, and text-to-speech, but it's also possible to bring the entire AI/ML stack onto the device, reducing the need for networking.

Prototype 2 Moving compute to Raspberry Pi

I still need to figure out how the Pi can use the LEDs to display system state. Besides, I need to program some microcontroller to meet the requirements of this class. Can we go one level deeper?

Move audio processing to hand unit

To make the project more challenging, I can use an ESP32-based audio system to pick up speech and play back AI voice. We can wirelessly connect the ESP32 with a nearby laptop, where the voice-driven AI interactions will take place.

The main body still needs a controller that communicates with the nearby laptop to do the following:

  1. Detect which socket is plugged in
  2. Control the LED status lights

Prototype 3 Audio processing in hand unit

The audio cable in this design does not really pass audio. It is solely used for detecting the state of plugged/unplugged. I need to figure out how to rig the 3.5mm jack to achieve this.

Build my own speaker/microphone

The next level is replacing the ESP32-based audio kit with a custom PCB, with speaker and microphone manually soldered. This will probably be the upper bound of the level of complexity I can handle.

Prototype 4 Build microphone and speaker on custom PCB

My next step is taking the idea to a TA for advice. This is my first time designing with electronics, so I do anticipate big revisions. Stay tuned.

Networking

Learning about embedded programming validated the design above. After getting hands-on experience building an echo server with ESP32, I now feel confident that I can relay data between the ESP32 hand unit and a nearby laptop using either a Wi-Fi or a serial connection. Next, I can explore several things in parallel:

Electronics design update

I consulted with our TA Quentin Bolsee regarding electronics design and received valuable help on input/output devices. I also conducted additional research using YouTube tutorials from atomic14, which enabled me to fully spec out the electronics for both components.

The hand unit will be built around a Xiao ESP32-C3 microcontroller, with upgrade options to ESP32-C6 if WiFi performance becomes a bottleneck, or to WROOM-32E if more GPIO pins are needed. For audio processing, I've selected the ICS-43434 I2S MEMS Microphone for input and the MAX98357A I2S Class D Amplifier paired with an 8-ohm speaker for output. However, the amplifier's specified frequency response of 600 Hz to 4000 Hz may not be ideal for voice applications, so I might need to find an alternative. The physical interface will include two buttons (single button for push-to-talk, both buttons for broadcast) and two switches (Power On/Off and Mode switch for interaction/programming). Power will come from a 3.7V LiPo battery, with a potential upgrade to a 3AA battery pack plus voltage regulator for easier replacement and a more vintage feel, though I need to consult with an electronics expert about the implementation details. Connectivity will be handled through a 3.5mm TRRS jack.

The main unit uses a simpler design with a Xiao ESP32-C3 microcontroller controlling 4 LEDs and 4 3.5mm TRRS jacks for the 2x2 grid configuration.

High-level schematic High-level design for the electronic components

For connection detection, I want to eventually support multiple hand units speaking simultaneously, which requires tracking which hand unit is plugged into which jack. Traditional physical TRS plug detection doesn't differentiate between different plugs, so I propose using TRRS jacks as a clever hack. By treating high/low voltage as 1/0 bits and using the sleeve as ground while the other 3 connections serve as signal lines, I can create 2^3 = 8 unique values. This allows each jack in the 2x2 grid to be uniquely identified by a 3-bit code. The main unit will be responsible for pulling up/down the 3 signal lines on the jacks, while the hand unit decodes the 3-bit code and sends it to the laptop along with its own unique ID.
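The 3-bit scheme can be sketched in plain C++ as follows. This is a minimal illustration, not the project's firmware: the names `JackLines`, `encodeJack`, and `decodeJack`, and the assignment of tip, ring 1, and ring 2 to bits 0, 1, and 2 are assumptions. On the ESP32, the three booleans would correspond to GPIO levels driven by the main unit and read by the hand unit.

```cpp
#include <cassert>
#include <cstdint>

// Logic levels on the three TRRS signal contacts (the sleeve is ground).
// The bit order (tip = bit 0, ring1 = bit 1, ring2 = bit 2) is an assumption.
struct JackLines {
    bool tip;
    bool ring1;
    bool ring2;
};

// Main unit side: turn a jack code (0..7) into the level to hold on each line.
JackLines encodeJack(std::uint8_t code) {
    assert(code < 8);  // only 3 bits are available
    JackLines lines;
    lines.tip = (code & 0x1) != 0;
    lines.ring1 = (code & 0x2) != 0;
    lines.ring2 = (code & 0x4) != 0;
    return lines;
}

// Hand unit side: turn the sampled line levels back into the jack code.
std::uint8_t decodeJack(const JackLines& lines) {
    return static_cast<std::uint8_t>((lines.tip ? 0x1 : 0) |
                                     (lines.ring1 ? 0x2 : 0) |
                                     (lines.ring2 ? 0x4 : 0));
}
```

One practical consideration, depending on the wiring: if the hand unit's inputs idle low when nothing is plugged in, code 0 would look the same as "unplugged", which leaves 7 usable codes. That is still more than enough for a 2x2 grid.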

TRRS socket TRRS socket has 4 pins

This design enables all necessary communication between the PC, hand unit, and main unit: hand unit plug-in messages with 3-bit codes and wireless IDs, audio streaming from hand units to PC using wireless IDs, audio streaming from PC to hand units using wireless IDs, and LED state updates from PC to main unit using 3-bit codes to identify specific jacks.
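On the PC side, these messages imply a small lookup table between jack codes and hand unit IDs. The sketch below is one possible shape for it, under assumptions: `PlugInMessage`, `PatchTable`, and the string format of the wireless ID are hypothetical names for illustration, and unplug handling is reduced to dropping a unit's previous jack when it reports a new one.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical plug-in message sent from a hand unit to the PC.
struct PlugInMessage {
    std::string handUnitId;  // the hand unit's wireless ID (format is an assumption)
    std::uint8_t jackCode;   // 3-bit code decoded from the TRRS lines
};

// PC-side table of which hand unit is currently plugged into which jack.
class PatchTable {
public:
    void onPlugIn(const PlugInMessage& msg) {
        assert(msg.jackCode < 8);
        removeUnit(msg.handUnitId);  // a hand unit occupies one jack at a time
        jackToUnit_[msg.jackCode] = msg.handUnitId;
    }

    // Audio for a jack's agent is streamed to the hand unit returned here.
    // Returns an empty string when the jack is free.
    std::string unitAtJack(std::uint8_t jackCode) const {
        auto it = jackToUnit_.find(jackCode);
        return it == jackToUnit_.end() ? std::string() : it->second;
    }

    // LED updates to the main unit are addressed by this jack code.
    // Returns -1 when the hand unit is not plugged in.
    int jackOfUnit(const std::string& handUnitId) const {
        for (const auto& entry : jackToUnit_) {
            if (entry.second == handUnitId) return entry.first;
        }
        return -1;
    }

private:
    void removeUnit(const std::string& handUnitId) {
        for (auto it = jackToUnit_.begin(); it != jackToUnit_.end();) {
            if (it->second == handUnitId) {
                it = jackToUnit_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::map<std::uint8_t, std::string> jackToUnit_;
};
```

Keeping the table on the PC matches the split described above: audio is routed by wireless ID, while LED updates are routed by jack code.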

Parts list

Component                            | Quantity | Availability  | Notes
Xiao ESP32-C3                        | 2        | Out of stock  | 1* for hand units, 1 for main unit
ICS-43434 I2S MEMS Microphone        | 1*       | Stocked       |
MAX98357A I2S Class D Amplifier      | 1*       | Stocked       |
PSR-57N08A01-AQ 8-ohm speaker        | 1*       | Stocked       |
3.5mm TRRS jack                      | 5        | Need to order | 1* hand unit + 4 main
TRRS audio cable                     | 1*       | Need to order |
3.7V LiPo battery                    | 1*       | Need to order |
Button                               | 2*       | Stocked       |
Slide switch                         | 2*       | Stocked       |
LED                                  | 4        | Stocked       |
3AA battery pack + voltage regulator | 1        | Optional      | Alternative power solution

*For a single hand unit. Need more for additional units

With this design update, it became clear that the main unit is essentially a "dumb" device: it encodes the identity of each TRRS socket and displays which AI agent is speaking, and it does no audio processing at all.

I have also gained insight into the physical constraints for the housing. The hand unit mainly needs to account for battery and speaker size, while the PCB size and shape can be more flexible. The main unit needs to account for the 4 TRRS jacks.

Here are new and remaining questions which I plan to resolve by going to TAs as well as attending future lectures.

  1. PCB design. atomic14's design is a good reference but I don't know how I can design my own.
  2. Packaging design. How do I hold the components in place, especially the 3.5mm TRRS jacks, which will receive physical stress?
  3. Physical interaction. How do I put buttons and a slide switch on the hand unit? I want a good tactile feel.
  4. LED lighting. How do I make a ring that lights up around the TRRS socket?
  5. The CBA electronics shop inventory doesn't match what the website says. For example, the ESP32s are out of stock but the website didn't reflect that.

And here are the things I can prototype now:

  1. Play voice from ESP32 over WiFi
  2. Capture sound from ESP32 over WiFi
  3. Address and light up 4 LEDs with ESP32
  4. Encode and decode TRRS identities between two ESP32 boards
  5. Design a case roughly based on atomic14's PCB footprint.

Sound output

With the help from our TA Quentin Bolsee, I installed the official ESP32 board manager following its documentation. Then I installed the specific library for Arduino ESP32 Nano from the board manager.

I used the official example code to play a square wave tone, with a few lines of modification to set the right output pin. Here is the full source.

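#define I2S_BCLK D7  // bit clock to the MAX98357A
#define I2S_LRC  D8  // left/right (word select) clock
#define I2S_DIN  D9  // serial audio data into the amplifier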

Sound output from ESP32 using MAX98357A amplifier

I found a powerful library for audio processing by Phil Schatzmann, called Arduino Audio Toolkit. After studying his examples, I was able to get my computer to send live microphone audio to the ESP32 over WiFi, and play it back immediately. The latency is about 1 second, which concerns me but isn't a deal breaker.

This POC validated the idea that we can shift all the computation to a PC nearby and let ESP32 handle audio input/output.

Latency test result: 1 second delay

PCB Design

I designed both the hand-held device (Operator) and the main body (Switchboard) as part of this week's PCB design exercise. See details in the weekly post.

PCB Production

I milled boards for both the Operator and the Switchboard using the Carvera Desktop CNC Machine. See details in the weekly post.

Case Prototype

I designed a simple box for the Switchboard in Onshape, featuring an elevated platform for the M2 mounting screws, a hole for the USB-C connector, and a simple enclosure for TRRS jacks that would allow for easy assembly.

Switchboard case design Switchboard case design (model)

However, the printing process turned into a series of challenges. I experienced repeated failures while printing PETG across multiple machines, despite following the precise specifications. I could only suspect the filament quality was poor.

Printing failure 1 Base layer delamination during printing

Printing failure 2 Spaghetti from the side

During one of the jobs, the filament ran out, and bridging in a second roll made the interface terrible. In another attempt, one filament got entangled with itself inside the spool, causing the machine to stop.

Printing failure 3 Entangled filament caused spaghetti

In the last version, I switched to PLA and successfully printed the case.

Successful print Successful print with PLA

Upon a quick assembly test, I took these notes for the next iteration:

  1. The USB-C connector was positioned at the wrong height.
  2. It might be simpler to slide the PCB into position rather than using screws for mounting.
  3. I discovered that for anything using screws, M3 is a much easier size to work with.
  4. The lid is desirably tight, but I need to add a small lip or gap on the case to make it easier to open.

Assembled case Assembled case

With lid With lid

Microphone

I took advantage of the input device week to prototype the microphone interaction. I was able to implement the entire input pipeline:

  1. User holds button to talk, microphone picks up voice, user releases button to stop recording.
  2. Audio sent over WiFi to nearby laptop via UDP.
  3. Laptop streams audio to OpenAI Realtime API for text response.
  4. Laptop uses text-to-speech to generate audio response and plays it immediately.
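Step 2 of the pipeline above, sending a push-to-talk recording over UDP, can be sketched as splitting the samples into datagram-sized packets. This is a minimal sketch under assumptions, not the project's code: the `AudioPacket` layout, the sequence number, the end-of-utterance flag, and the 16-bit mono PCM format are all hypothetical. The actual socket send is left out so the logic can run on any machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packet layout; the project's real UDP payload is not specified.
struct AudioPacket {
    std::uint16_t sequence;             // lets the laptop spot lost or reordered packets
    bool lastInUtterance;               // true for the final packet after button release
    std::vector<std::int16_t> samples;  // 16-bit PCM, assumed mono
};

// Split one push-to-talk recording into packets of at most maxSamples each.
std::vector<AudioPacket> packetize(const std::vector<std::int16_t>& recording,
                                   std::size_t maxSamples) {
    assert(maxSamples > 0);
    std::vector<AudioPacket> packets;
    std::uint16_t seq = 0;
    for (std::size_t start = 0; start < recording.size(); start += maxSamples) {
        std::size_t end = start + maxSamples;
        if (end > recording.size()) end = recording.size();

        AudioPacket p;
        p.sequence = seq++;
        p.lastInUtterance = (end == recording.size());
        p.samples.assign(recording.begin() + start, recording.begin() + end);
        packets.push_back(p);
    }
    return packets;
}
```

At a 16 kHz sample rate (an assumption), 512 samples per packet is 32 ms of audio and 1024 bytes of payload, which stays under the 1472-byte UDP payload that fits a standard 1500-byte Ethernet MTU without fragmentation.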

The TRRS Connector

I found a dozen TRRS male and female connectors in my lab. They look nicer than the SMD version I originally planned to use. But without the datasheet, I need to reverse engineer the schematic. So I probed them with a multimeter and confirmed the internal connections.

TRRS pinout TRRS pinout diagram

The female connectors will be mounted just under the lid of the Switchboard case. I still need to figure out how to fabricate and attach the cables.

Next steps:

Speaker

During the Output Device week, I completed the full voice interaction loop, with voice-in, voice-out, and AI processing in between.

Full voice interaction loop demo

Hand Unit Form Study

I used a piece of paper to sketch out the form factor, just so I can hold it in my hand and feel the size.

Low-fidelity hand unit Extremely-low-fidelity hand unit prototype

Using this prototype, I laid out the components and realized I might have to increase the dimensions to make everything fit.

Component layout Layout option 1

Component layout Layout option 2

Battery

The ESP32 board has a well-documented charging circuit for 3.7V LiPo batteries. Knowing that working with batteries is a bit risky, I decided to start with a simpler USB-C power bank. This is the smallest option I found on Amazon:

USB-C power bank USB-C power bank for hand unit

Mounting mechanism

There are several mounting challenges. This week I investigated the mounting strategy for the PCB and the TRRS jack.

I produced different sizes of mounting brackets (download STEP file) to find the right size.

Bracket model Modeling the brackets with offsets from the measured board size

PCB mounting bracket test PCB mounting brackets

PCB mounting test PCB mounting test

I observed that for the 28mm board, a +0.1mm offset (28.1mm bracket) makes a good fit. I need at least 2mm of height to clear the solder joints.

I also 3D printed a model to test the TRRS jack mounting.

TRRS test model TRRS jack test model

TRRS mounting test TRRS jack mounting test

Testing revealed that 7mm diameter is the best fit, and max thickness can be 2.5mm. This means I can use 2mm thick walls for the case.

Mid-term review

Remaining tasks

  1. Fabricate Operator
    • Design button mounting mechanism (2 buttons)
    • Design speaker mounting mechanism
    • Model updated case and enclosure
    • Solder TRRS connector
    • 3D print and assemble
    • 3D print button caps
  2. Fabricate Switchboard
    • Order vintage LEDs
    • Design LED mounting mechanism
    • Solder TRRS connectors
    • Model updated case and lid
    • 3D print and assemble
  3. Software
    • Implement multi-agent simulation
    • Implement automatic server IP discovery
    • Implement diagnostic UI

Stretch goals:

Delivery plan:

Questions for TA:

  1. Button, speaker, LED mounting mechanism?
  2. Battery + Power: what kind of switch should I use? How to mount?
  3. Slider switch: what options do I have? How complex?

Physical Assembly Test

During the mid-term review, Alan Han suggested mounting options and power solutions.

Because Thanksgiving travel was approaching, I wanted to use lab time to test the mounting as soon as possible.

I updated the 3D models to account for the speaker, audio jacks, and buttons. My CAD speed was improving. In half a day, the updated models for both the Operator and the Switchboard were ready for printing.

Updated Switchboard Updated Switchboard model

Updated Operator Updated Operator model

These were test prints intended only to reveal design issues, so I used a 0.15 mm layer height and 15% infill for a quick turnaround. Production prints would be much finer.

3D printed parts Slicing for speed

For assembly I skipped soldering so components could be moved and adjusted — a deliberately "wireless" assembly.

Wireless assembly Assembly without soldering

The physical prototype immediately revealed several problems:

Button collision issue Buttons colliding with speaker

TRRS collision issue TRRS barrel colliding with PCB header

I quickly revised the design so the components sat in the correct positions. That update produced the first successful physical assembly of the system.

Case Updated case to address the issues found in the assembly test

In context Mounting all the components (except for wiring)

Unboxed view Unboxed view of the system

Validating LED

I validated the LED connection and voltage design with a simple program that blinks all the LEDs. To make it interesting, I added a PWM-like brightness fade by rapidly toggling the LEDs on and off with a varying on-time to simulate different brightness levels.

/*
ESP32 LED pulse using software PWM: rapidly toggle the LEDs on and off,
varying the on-time within a fixed period to control brightness.

Pinout:
LED1: D0
LED2: D1
LED3: D2
LED4: D3
LED5: D7
LED6: D8
LED7: D9
LED8: D10
*/

const int ledPins[] = {D0, D1, D2, D3, D7, D8, D9, D10};
const int numLeds = sizeof(ledPins) / sizeof(ledPins[0]);

const int PERIOD_US = 5000;  // length of one on/off cycle
const int MAX_ON_US = 500;   // on-time at full brightness (10% of the period)

// Drive every LED to the same level.
void writeAll(int level) {
  for (int i = 0; i < numLeds; i++) {
    digitalWrite(ledPins[i], level);
  }
}

// Run one on/off cycle at the given brightness (0-255).
void pulseOnce(int brightness) {
  int on_us = map(brightness, 0, 255, 0, MAX_ON_US);
  int off_us = PERIOD_US - on_us;
  writeAll(HIGH);
  delayMicroseconds(on_us);
  writeAll(LOW);
  delayMicroseconds(off_us);
  delay(1);  // extra off-time that slows the fade slightly
}

void setup() {
  for (int i = 0; i < numLeds; i++) {
    pinMode(ledPins[i], OUTPUT);
    digitalWrite(ledPins[i], LOW);
  }
}

void loop() {
  // Fade in, then fade out.
  for (int brightness = 0; brightness <= 255; brightness++) {
    pulseOnce(brightness);
  }
  for (int brightness = 255; brightness >= 0; brightness--) {
    pulseOnce(brightness);
  }
}
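To see how bright the fade actually gets, the timing in the sketch can be reproduced on a PC. In this minimal sketch, `mapRange` uses the same integer formula as Arduino's `map()`, and `dutyPercent` is a hypothetical helper. It ignores the extra `delay(1)` per step and the time spent in `digitalWrite`, so real duty cycles are slightly lower.

```cpp
#include <cassert>

// Same integer arithmetic as Arduino's map().
long mapRange(long x, long inMin, long inMax, long outMin, long outMax) {
    assert(inMax != inMin);
    return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
}

// On-time as a percentage of one software-PWM period.
double dutyPercent(int brightness, long maxOnUs, long periodUs) {
    assert(periodUs > 0);
    long onUs = mapRange(brightness, 0, 255, 0, maxOnUs);
    return 100.0 * static_cast<double>(onUs) / static_cast<double>(periodUs);
}
```

With the values in the LED program (on-time up to 500 microseconds in a 5000 microsecond period), `dutyPercent(255, 500, 5000)` evaluates to 10, so the LEDs peak at roughly a 10% duty cycle. Mapping the on-time up to the full period instead would use the whole brightness range, if a brighter peak is wanted.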

LED test LED test successful

In my circuit, I used a 100-ohm resistor in series with each LED, which is rated at 1.9V forward voltage and 20mA forward current. Assuming a 3.3V supply from the ESP32, the current through the LED would be approximately (3.3V - 1.9V) / 100 ohms = 14mA, which is a bit low.

Double-checking the math with DigiKey's LED Resistor Calculator, the resistor for the full 20mA would be 70 ohms, so choosing 100 ohms was safe.
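The same arithmetic can be written as two small functions based on Ohm's law. This is a sketch for checking the numbers above; the function names are made up for illustration, and it assumes the GPIO pin holds a steady 3.3V under load.

```cpp
#include <cassert>

// Current through the LED, in milliamps, for a given series resistor.
double ledCurrentMilliamps(double supplyV, double forwardV, double resistorOhms) {
    assert(resistorOhms > 0.0);
    return (supplyV - forwardV) / resistorOhms * 1000.0;
}

// Smallest series resistor that keeps the LED at or below targetMilliamps.
double minResistorOhms(double supplyV, double forwardV, double targetMilliamps) {
    assert(targetMilliamps > 0.0);
    return (supplyV - forwardV) / (targetMilliamps / 1000.0);
}
```

For the values in this build, `ledCurrentMilliamps(3.3, 1.9, 100.0)` comes out at about 14 mA and `minResistorOhms(3.3, 1.9, 20.0)` at about 70 ohms, matching the hand calculation and the calculator.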

Knowing that the TRRS addressing works from Week 6 and the LED output works from this test, we are ready to connect all the components in Switchboard!

Soldering LED Soldering LED, one by one

During the testing, several LED legs were snapping. I had to apply hot glue as reinforcement.

Reinforcing LED legs Reinforcing LED legs with hot glue

With several hours of non-stop soldering, all the LEDs were finally blinking!

Testing all lights in sequence

Wiring up the Operator

After wiring up the Switchboard with 2 days of non-stop soldering, I had gained significant experience and had all the equipment dialed in. I also switched to single-core 22 AWG wire for stronger joints. The wire-up was a breeze.

Wiring up the Operator Operator, fully wired up

Remaining tasks: