BeagleBoard/GSoC/Modern Speak and Spell
- 1 ProposalTemplate
- 2 Status
- 3 Tasks Done
- 3.1 About Me
- 3.2 Modern “Speak & Spell” using PocketBeagle
- 3.2.1 Introduction
- 3.2.2 Modern “Speak & Spell” project overview
- 3.2.3 Detailed Description
- 3.2.4 Progress till now
- 3.2.5 Timeline:
- 3.3 Final Goals:
- 3.4 Experience
- 3.5 Benefit
- 3.6 Future Contributions
Recreating and improving the functionality of the original Speak & Spell toy by Texas Instruments, and generating open-source code for it.
Student: Anirban Banik
Mentors: Jason Kridner
To celebrate the 40th anniversary of the “Speak & Spell” from Texas Instruments, this proposal was accepted in order to create an updated “Speak & Spell” using a PocketBeagle.
I have completed the required task, as described on the ideas page, and created a pull request, as listed here.
IRC: AnirbanBanik1998 || Anirban
E-Linux Username: AnirbanBanik1998
School: NIT Durgapur, Durgapur, West Bengal
Primary language: English, Hindi, Bengali
Typical work hours: 9:30 - 23:00 IST GMT/EST/PST to IST Adjusted Time
Previous GSoC participation: This is my first GSoC participation. I became interested in open source the day I started working with it, and soon learnt about GSoC. I am really excited to work with an open-source community and generate useful open-source code.
Tools (proficient): Git, Linux
Hardware Skills: Arduino, Raspberry Pi (basic)
Modern “Speak & Spell” using PocketBeagle
Project name: Modern “Speak & Spell” using PocketBeagle
To rebrain the Speak and Spell with a general Linux application that can be reproduced and is not a one-off build.
My approach to this problem will be mostly in the software domain, with some basic wiring in the hardware domain.
Modern “Speak & Spell” project overview
The Speak & Spell was an electronic hand-held computer first introduced in 1978, which instantly became a favourite of children and adults alike. It consisted of a TMC0280 linear predictive coding speech synthesizer, a keyboard, and a receptor slot to receive one of a collection of ROM games. The improvements that I am trying to bring in this project are Speech to Text along with Text to Speech functionality. For the former, I am planning to use CMU Sphinx, and for the latter I will use CMU Flite, which is designed for small embedded systems. I aim to implement speech recognition not only for recognizing the spelling spoken by the user, but also as a voice launcher for starting the games at the user's command. This can be especially helpful for those who don't have a keyboard or don't want to use one. This is an added functionality; typing will always remain available.
Basically, there are four games to be implemented on this device: "Spell It", "Hangman", "Encrypter" and "Crossword". On starting the application:
1. The user will be prompted to select a game from the given list of games. The user can launch the game either by voice or by keyboard. This is where the voice launcher comes in.
2. The "Spell It" game prompts the user to speak the spelling of a word spoken by the application. The word comes from a previously created dictionary. This is where Text to Speech comes in.
3. The "Hangman" game has two levels. In level 1 it asks the user to fill in the blanks to form a word. If a letter entered or spoken by the user occurs more than once in the answer, it is placed in every blank where it belongs in one go. In level 2, the letters are visible, but jumbled. The user just has to move each letter to its actual position before the time runs out.
4. The "Encrypter" game is a multi-user game. One user encrypts a word in a certain way, and the other has to decrypt it to get back the original word.
5. In the "Crossword" game, the user only has to specify where to work, i.e. which row or column, and then proceeds as in level 1 of the "Hangman" game. The only difference is that each word generates another word in another row or column.
A rough diagram of the "Crossword" game.
Finally, scores will be generated for every game along with suggestions to improve.
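The level 1 "Hangman" rule described above, where a guessed letter fills every blank it belongs to in one go, can be sketched in Python as follows. This is a minimal illustration and not the project's actual code; the function name `reveal_letter` and the use of `_` for blanks are assumptions made for the example.

```python
def reveal_letter(answer, shown, guess):
    """Return the updated display after one guess.

    answer: the hidden word, e.g. "apple"
    shown:  the current display, with "_" for each blank (assumed convention)
    guess:  a single letter typed or spoken by the user
    """
    guess = guess.lower()
    # Walk both strings in step: reveal the letter wherever it matches,
    # otherwise keep whatever was already displayed at that position.
    return "".join(
        a if a.lower() == guess else s
        for a, s in zip(answer, shown)
    )


if __name__ == "__main__":
    display = "_" * len("apple")
    display = reveal_letter("apple", display, "p")
    print(display)  # _pp__
```

A wrong guess leaves the display unchanged, so the caller can compare the display before and after to decide whether to count a miss.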
Progress till now
I have started working on this project in order to improve my understanding of how this application is going to work. First, I recreated the original games, so that I can add more functionality to them later. I have finished building two games, "Spell It" and "Hangman", and have started work on "Encrypter". Regarding CMU Sphinx, so far I have restricted myself to an English Speak and Spell, which I aim to extend to other languages as well. Using CMU Sphinx requires a dictionary of the words to be recognized. I compiled the word list manually and generated a phonetic dictionary from it using the CMU Sphinx website. This dictionary contains words to start and stop games, and the letters A to Z to aid in spelling.
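As an illustration of the word list described above, the following sketch builds a plain-text vocabulary of command words plus the letters A to Z, one entry per line. The command words `START` and `STOP` and the function names are assumptions chosen for the example; the proposal only states that the dictionary holds words to start and stop games and the letters.

```python
import string

# Hypothetical command words; the real list is not given in the proposal.
COMMANDS = ("START", "STOP")


def build_word_list(commands=COMMANDS):
    """Return the vocabulary: command words followed by the letters A to Z."""
    return list(commands) + list(string.ascii_uppercase)


def format_corpus(words):
    """Join the words one per line, ending with a newline."""
    return "\n".join(words) + "\n"
```

Writing `format_corpus(build_word_list())` to a text file would give a word list that could then be turned into a phonetic dictionary, replacing the manual step.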
A rough functioning of the Hangman Game.
I have started implementing Speech to Text. Gradual progress is being made, which is adding to my experience as well. At present, two terminals have to be used simultaneously: one for launching PocketSphinx, and another for the actual games.
I have already started work on this project and will try to extend it to other languages, as well as develop it on BeagleBone once selected.
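On the game side, the voice launcher has to turn whatever text the recognizer produces into a game choice. A minimal sketch of that mapping is shown below, assuming the recognizer hands over its hypothesis as a plain string; the `GAMES` table, the identifiers in it, and the function name `match_command` are hypothetical and do not come from the project's code.

```python
# Spoken phrase -> internal game identifier (identifiers are assumptions).
GAMES = {
    "SPELL IT": "spell_it",
    "HANGMAN": "hangman",
    "ENCRYPTER": "encrypter",
    "CROSSWORD": "crossword",
}


def match_command(hypothesis):
    """Return the game named in the recognized text, or None if there is none."""
    # Normalize case and collapse repeated whitespace before matching.
    text = " ".join(hypothesis.upper().split())
    for phrase, game in GAMES.items():
        if phrase in text:
            return game
    return None


if __name__ == "__main__":
    print(match_command("start hangman"))  # hangman
```

Returning `None` for unrecognized input lets the application fall back to the keyboard prompt, which always remains available.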
Community Bonding Period
Getting acquainted with the BeagleBone documentation, improving my previous code, and discussing further improvements in depth with the mentors.
Preparing the introductory presentation slides and video. Deploying CMU Sphinx and CMU Flite on the PocketBeagle, to be used later.
Getting the OLED/Display working and working on interfacing with an I2C GPIO expander.
Week 3 - 4
Working on a web crawler to crawl Wiktionary. It can be used to parse Wiktionary pages and generate the word lists that will be used later.
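The parsing half of such a crawler could look roughly like the sketch below, which uses only Python's standard `html.parser` module. It assumes, purely for illustration, that the words of interest appear as link texts inside list items; the real page layout would have to be checked, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class WordListParser(HTMLParser):
    """Collect single-word link texts found inside <li> elements."""

    def __init__(self):
        super().__init__()
        self.li_depth = 0      # how many <li> elements we are nested in
        self.in_link = False   # True while inside an <a> within an <li>
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_depth += 1
        elif tag == "a" and self.li_depth > 0:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "li" and self.li_depth > 0:
            self.li_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        word = data.strip()
        # Keep purely alphabetic entries so phrases and symbols are skipped.
        if self.in_link and word.isalpha():
            self.words.append(word.lower())


def extract_words(html):
    """Return the list of words found in the given HTML text."""
    parser = WordListParser()
    parser.feed(html)
    return parser.words


if __name__ == "__main__":
    sample = "<ul><li><a href='/wiki/cat'>cat</a></li><li><a href='/wiki/dog'>dog</a></li></ul>"
    print(extract_words(sample))  # ['cat', 'dog']
```

Keeping the parsing separate from the download step means the word extraction can be tested on saved pages without network access.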
Improving the "Spell It" game from its previous version. I will switch from "espeak" to "CMU Flite" because it additionally offers different accents, so that people from different regions can understand its speech.
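One simple way to drive Flite from the game code is through its command-line program, as in the hedged sketch below. It assumes the `flite` binary is installed and on the PATH, and that any voice name passed in (for example `slt`) is available in that build; the helper names are hypothetical.

```python
import subprocess


def flite_command(text, voice=None):
    """Build the argument list for speaking `text` with the flite CLI."""
    cmd = ["flite"]
    if voice:
        # -voice selects one of the voices compiled into this flite build.
        cmd += ["-voice", voice]
    # -t passes the text to speak directly on the command line.
    cmd += ["-t", text]
    return cmd


def speak(text, voice=None):
    """Speak the text aloud; raises CalledProcessError if flite fails."""
    subprocess.run(flite_command(text, voice), check=True)
```

Calling `speak("spell the word apple")` would then read the prompt aloud, and passing a different `voice` is where a choice of accent could be exposed to the user.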
Week 6 - 7
Working on the user interface of the "Hangman" game. In level 1, I will work on refining the time constraint to make the game more interesting. In level 2, I will have to work more on the speech-processing unit, because giving such commands over the microphone is challenging.
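The core of level 2 can be sketched independently of the interface: jumble the word, then let the user move one letter at a time until it matches the answer. The function names below are assumptions for illustration, and the time limit is left out for brevity.

```python
import random


def jumble(word, rng=random):
    """Return the letters of `word` in a shuffled order that differs from it."""
    letters = list(word)
    if len(set(letters)) < 2:
        return word  # e.g. "aaa" cannot be rearranged into anything new
    while True:
        rng.shuffle(letters)
        candidate = "".join(letters)
        if candidate != word:
            return candidate


def move_letter(current, src, dst):
    """Move the letter at index `src` so that it ends up at index `dst`."""
    letters = list(current)
    letters.insert(dst, letters.pop(src))
    return "".join(letters)


if __name__ == "__main__":
    print(move_letter("tac", 2, 0))  # cta
    print(move_letter("cta", 1, 2))  # cat
```

A spoken command would only need to supply the two positions, which keeps the vocabulary the recognizer has to handle small.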
Week 8 - 9
Building the "Encrypter" game, adding multi-user gaming functionality. Creating a database to store data about the various ciphers and creating functions to encrypt the words accordingly. Allowing additions to the database over time.
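A small in-memory registry can stand in for the cipher database while the game logic is developed. The sketch below is an assumption-laden illustration: the proposal does not say which ciphers will be used, so a Caesar shift and a simple reversal are chosen here as examples, and all names are hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar(word, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    out = []
    for ch in word.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave non-letters untouched
    return "".join(out)


def reverse(word, _key=None):
    """Reverse the word; the key is ignored and reversal undoes itself."""
    return word[::-1]


# name -> (encrypt function, decrypt function); new ciphers can be added here.
CIPHERS = {
    "caesar": (caesar, lambda word, key: caesar(word, -key)),
    "reverse": (reverse, reverse),
}


def encrypt(name, word, key=None):
    return CIPHERS[name][0](word, key)


def decrypt(name, word, key=None):
    return CIPHERS[name][1](word, key)


if __name__ == "__main__":
    secret = encrypt("caesar", "apple", 3)
    print(secret)                        # dssoh
    print(decrypt("caesar", secret, 3))  # apple
```

Storing each cipher as an encrypt/decrypt pair means the second player's answer can be checked by decrypting with the same entry the first player used.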
Week 10 - 11
Building the "Crossword" game as an extension of the "Hangman" game, and creating pop-ups that provide hints at specific points in time to help users in case they are stuck.
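The way crossing words share letters can be sketched with a simple grid in which a word is only placed if it fits and agrees with the letters already present. This is an illustrative sketch, not the project's design; the `.` marker for empty cells and the function names are assumptions.

```python
EMPTY = "."  # assumed marker for an unfilled cell


def make_grid(rows, cols):
    """Create a rows x cols grid of empty cells."""
    return [[EMPTY for _ in range(cols)] for _ in range(rows)]


def place_word(grid, word, row, col, direction):
    """Place `word` starting at (row, col), going "across" or "down".

    Returns True if the word was placed, or False if it would leave the
    grid or clash with a different letter that is already there.
    """
    dr, dc = (0, 1) if direction == "across" else (1, 0)
    cells = [(row + i * dr, col + i * dc) for i in range(len(word))]
    # First pass: validate every cell without modifying the grid.
    for (r, c), ch in zip(cells, word):
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] not in (EMPTY, ch):
            return False
    # Second pass: write the letters.
    for (r, c), ch in zip(cells, word):
        grid[r][c] = ch
    return True


if __name__ == "__main__":
    grid = make_grid(5, 5)
    place_word(grid, "cat", 0, 0, "across")
    place_word(grid, "cow", 0, 0, "down")  # shares the letter "c"
    print("\n".join("".join(line) for line in grid))
```

Validating before writing keeps the grid unchanged when a placement is rejected, so a failed attempt never has to be undone.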
Testing, fixing bugs, merging. Improving documentation for more widespread use by the open-source community.
Prepare the final presentation slides and video.
A modern Speak and Spell using PocketBeagle, all of whose parts are reproducible, together with open-source code for it, so that it can be used more widely.
Although I had Computer Science as a subject in school, I was never really aware of everything going on in this domain until I reached college, where I was fascinated by the progress occurring in the field. I was particularly impressed by the freedom one gets in open source. I would like to contribute to this community, not only because of my interest in hardware in general, but also because this project as a whole appealed to me.
Previous Contributions to Open-Source
- Maintained a repository on Socket-Programming in Python for HacktoberFest. https://github.com/lugnitdgp/Socket-Programming
- Made a multi-client Chat-Server on a single host computer in Node.js as a learning process. https://github.com/AnirbanBanik1998/Chat-Server
- Made a Cleanup application to cleanup memory space filled up by unused apps. https://github.com/AnirbanBanik1998/Cleanup
- As a part of GSoC-Heat, contributed to MemeFinder, a search engine to store memes scraped from Reddit.com https://github.com/NITDgpOS/MemeFinder/commit/bed08441e8a75b0e6aa6afa12ce42ef10dcbe891 https://github.com/NITDgpOS/MemeFinder/commit/04c4ff1c3a407327be7f1918052c8626cf0eea2e
Contributions to this project
As there was no upstream repository to contribute to, I started recreating the games myself, and worked on adding Speech to Text and Text to Speech functionality. https://github.com/AnirbanBanik1998/Speak_and_Spell
- Participated in GSoC Heat contest in our college, and got selected for a mini experience of the actual GSoC. Completed the tasks given in my proposal successfully.
During the entire GSoC period I won't have any academic duties to fulfil, since the summer break is quite long. Thus I can devote my whole time to the project. This will in turn help me gain experience in building projects on open-source hardware, and in open source in general.
I would enjoy working on this project and will pour in all the hard work and passion required.
This device has always been aimed at pre-schoolers, and they will surely find it intriguing as well as educational. The newly added functionality will make the device more versatile.
What community members say
A voice recognition SnS would be kinda cool. Jason Kridner(jkridner[m])
This could help with learning english, but could also be extended to other languages. ie have a chinese speak-and-spell, french speak-and-spell, etc. Erik Welsh(erik.welsh)
I will be active in the community, contributing to more open-source projects, gaining more valuable experience, and helping newcomers get acquainted with BeagleBone in general, the way I was helped when I first asked in the group.