Red Herring, the leading technology media group that covers startup companies, recently announced its 100 finalists for the coveted Red Herring Europe award. One of the finalists is the Spanish company Ta with you, a startup from Barcelona that is developing SMT (Statistical Machine Translation) solutions.
The company website, http://www.tauyou.com, advertises two types of translation solutions:
* An SMS solution for translation of text between several languages. The user enters text as an SMS on his/her cellphone, specifies the language, sends it to a phone number in Spain and gets back the translation. Instructions for using this solution are available at www.tauyou.com/en/sms.html. According to the Ta with you blog, this service is offered in partnership with Vodafone.
* T-Image solution. The user takes a photo of some text using his/her cellphone, uploads it to the Ta with you website and gets back a translation of the text. This solution can be accessed on http://www.tauyou.com/m_new/image.php?idioma=en. Further information on this feature can be found on the Ta with you blog. A description of the T-Image API can be found on http://tauyou.3scale.net/.
Another solution touted on the Ta with you blog is real-time translation of TV show subtitles. This feature would work on a film that has subtitles in one language, making them available in other languages in real time.
Disclaimer: the SMS and T-Image features did not work when we tested them. If anyone has experience with these services or can successfully test them, please let us know. Furthermore, Ta with you does not provide any access to its SMT software for evaluation or testing. The company provided a translation portal for the 2009 Mobile World Congress in Barcelona that combines machine translation and TTS playback (see it at http://gsma.mobitra.mobi/). However, this portal uses Google MT and does not use the Ta with you MT system.
The following is an interview with company CEO Dr. Diego Bartolome, which was conducted by e-mail on March 23, 2009. A bio of Dr. Bartolome can be found on http://younoodle.com/people/diego_bartolome.
GTS: Please provide more information about your proprietary SMT system. What languages does it support? How accurate is it?
Diego Bartolome: Our proprietary SMT solution is a domain adapted statistical machine translation (SMT) platform, which is completed by pre- and post-processing steps to guarantee quality in particular verticals. Our technology has a human quality for a particular domain, thus significantly better than any other free system for any language pair.
Although we have a basic translation engine fed by freely available multilingual corpora from the Internet, as well as dictionaries and other resources, we work very closely with our clients and make a system specifically designed to fulfill their needs. After an analysis of the domain and pairs of languages they require, we need at least 1 million words from their previously translated texts or translation memories for a particular client and domain, e.g. medical, pharma, bank, news, computer, automotive, etc. The larger the amount of data for that domain, the better the overall quality of the system. This and additional data we generate feed the SMT engine, and we carry out the training for the domain to achieve almost human quality for that task.
Therefore, we work with any language and any domain, achieving a close-to-perfect accuracy for the domain.
GTS: Can the Enterprise server be deployed by the client? Do you sell this software? Do you provide remote service through SOAP interface or equivalent?
Diego Bartolome: Yes, domain-adapted SMT is mostly an enterprise-level solution. We build the system for the client as I have explained, and we can deploy it either as Software as a Service on an external server or on the client's premises. It can be accessed through web interfaces or by any means the customer desires; we are IT specialists.
However, we need access to the system to enhance the quality monthly.
The business model is a set-up fee plus monthly maintenance and support, and the price depends mostly on the client.
GTS: What is the throughput of your MT solution?
Diego Bartolome: For the domain adapted SMT, we can translate more than 40 million words per day with a single server, with full scalability. We can fulfill the language needs of any company, reduce their costs by more than 40% and decrease delivery times to the minimum.
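To put the quoted throughput in more familiar terms, the figure can be converted to words per second. The following is a back-of-the-envelope sketch only; it assumes the server translates continuously for 24 hours a day, which the interview does not state.

```python
# Back-of-the-envelope conversion of the throughput quoted in the interview.
# Assumption (not stated in the interview): the server translates
# continuously, 24 hours a day.
WORDS_PER_DAY = 40_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

words_per_second = WORDS_PER_DAY / SECONDS_PER_DAY
print(f"{words_per_second:.0f} words per second")  # prints "463 words per second"
```

Under that assumption, "more than 40 million words per day" corresponds to a sustained rate of roughly 460 words per second on a single server.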
GTS: Is there any way to test the Enterprise software? Can you provide a web page for testing purposes to test text translation?
Diego Bartolome: Solutions for the customers are protected by NDAs, so we cannot offer them to other people. However, for sure you can test a Catalan to Spanish translator if it helps, just send an email to translator@mobitra.mobi, with Catalan to Spanish or Spanish to Catalan in the subject, and the text in the body. To see how it works, you can look at the Catalan - Spanish bilingual newspaper www.elperiodico.es / www.elperiodico.cat
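For readers who want to script such a request, the e-mail format described above (language direction in the subject, text in the body) can be sketched with Python's standard `email` library. The function name `build_translation_request` is our own illustration and not part of any Ta with you API. The sketch only builds the message; actually sending it would require your own SMTP server and sender address, and we have not verified that the address still accepts mail.

```python
from email.message import EmailMessage

# The two subject lines mentioned in the interview.
ALLOWED_DIRECTIONS = {"Catalan to Spanish", "Spanish to Catalan"}


def build_translation_request(text: str, direction: str = "Catalan to Spanish") -> EmailMessage:
    """Build (but do not send) an e-mail in the format described in the interview.

    Hypothetical helper for illustration only.
    """
    if direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}")
    msg = EmailMessage()
    msg["To"] = "translator@mobitra.mobi"  # address given in the interview
    msg["Subject"] = direction             # language pair goes in the subject
    msg.set_content(text)                  # text to translate goes in the body
    return msg


request = build_translation_request("Bon dia, com estàs?")
print(request["Subject"])
```

To send the message, it could be passed to `smtplib.SMTP(...).send_message(request)` after adding a `From` header, using whatever mail server you have access to.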
GTS: On your website, you advertise a 'T-Image solution' which allows mobile phone users to take a photograph of text and upload it to your website to get a translation. How does your server read the image? Does it have an image-to-text engine? Does it have error correction software? How clear and how large does the text have to be?
Diego Bartolome: For our T-Image! product, it currently works fine if the customer 1) avoids text distortion and perspective, 2) holds still and uses good lighting, 3) turns off the flash, 4) fills the screen with the document (best if portrait or landscape), 5) takes a sharp and focused photo. Besides, it's recommended that the camera has autofocus capabilities if they want to take a picture of small texts (e.g. 12 or 14 points), otherwise the font size shall be very big (greater than 30 points). At this point, we do not support handwritten texts, although it might work depending on the way the clients write.
Our technology here is mostly related to the quality increase of the image and the robust recognition of the text in a natural scene. Since the product is in beta phase, we support the recognition and / or translation of printed materials such as newspapers, books, cards, menus, etc.
For the T-Image!, the average response time including the upload of the image is about 20 seconds.
GTS: How does the automatic subtitling solution work? Do you need at least one subtitle to exist? Does it work in real time, or do you need a pre-processing phase?
Diego Bartolome: The real-time subtitling requires at least one subtitle to exist, and it is done in real time. Therefore, if the TV channels deploy it, it will be something they preprocess and then provide the viewers with a set of subtitling options, and if it's deployed at home, the subtitling will be always done in real-time. Since it's an ongoing project, we prefer not to disclose many details ...
GTS: Thanks very much for this interview.