
Help with making Text To Speech work in Persian



movaffag
Sunday 01 Shahrivar 1388, 23:44
Hello to all forum members. I would appreciate your help with making Text To Speech work in Persian in Delphi.

I have worked with Text To Speech before, but now my software needs Persian pronunciation.

Please help me if you can!

K.Mohammadreza
Monday 02 Shahrivar 1388, 17:29
Dear friend, here is all of my experience in this area.
1- The first time I wanted to write a Speech program, an internet search led me to the following site:
http://www.exceletel.com/
Besides its useful, practical components, this site had an English guide to installing and setting up the Speech SDK in Delphi. Along with the download link, it explained how to install and use the SDK. You can visit the site yourself and find the information you need. I previously wrote a telephone auto-attendant (IVR) program that works by voice instead of keyed-in numbers. For example, instead of pressing 1 to reach the manager's office, the caller says the word "manager" to be connected to the manager's office.
The name of the page that I downloaded from that site and saved on my computer:
TAPI and SAPI Speech by ExceleTel Page 2.htm

And some of its contents:

Getting Started


Here is a checklist of what you will need when you are ready to use what we are showing you here. You can follow the links to download the software.


ExceleTel TeleTools (free trial will do)

Microsoft SAPI 5.1 SDK (http://www.microsoft.com/speech/download/sdk51/)

First we will install the SAPI speech SDK. You can find it on the Microsoft site at http://www.microsoft.com/speech/download/sdk51/ Be forewarned, it is a 68 MB download. It's an easy install: all you have to do is run the EXE and you have all the SAPI DLLs, sample programs and documentation. Delphi users will appreciate that there is a type library included to make life even easier with 19 components!
Next, you must tell your development environment that you want to use the files it has installed onto your computer in your project. Here are the instructions for a few languages. You will have to consult the documentation on adding references in your environment if it is not in this list.


Visual Basic 6

VB.Net

Delphi 5, 7 (http://www.exceletel.com/support/whtpapers/speech/delphi.htm)

Once you get everything loaded, you can access the speech objects. In order to understand how Microsoft put this together, it's useful to understand the concept of "tokens". A token is an object representing a resource that is available on a computer, such as a voice, a recognizer, or an audio input device. A token gives an application an easy way to inspect the various attributes of a resource without having to instantiate it. The tokens are stored in the registry. For example, the ISpeechObjectTokens for the voice object contains enumerators for voice, vendor, age, language and gender. Using tokens, you can find only the "female" voices, find which voices are those of a "child", or get only the voices in "Spanish". Token enumerators are COM objects that enumerate the necessary entries for the tokens under them. Another example would be the AudioOutput tokens, which give a listing of all the available audio output devices and the parameters associated with them.
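To make the token idea concrete, here is a minimal Python sketch that models tokens as plain dictionaries and filters them by attribute without creating any voice object. The voice names and attribute values are invented for illustration. They are not real registry entries, and `find_tokens` is a hypothetical helper, not part of SAPI.

```python
# Hypothetical token records: each one only describes a resource,
# it does not instantiate it. Names and values are made up.
VOICE_TOKENS = [
    {"name": "Voice A", "gender": "Female", "age": "Adult", "language": "English"},
    {"name": "Voice B", "gender": "Male", "age": "Adult", "language": "English"},
    {"name": "Voice C", "gender": "Female", "age": "Child", "language": "Spanish"},
]


def find_tokens(tokens, **required):
    """Return the tokens whose attributes match every required value."""
    return [
        token
        for token in tokens
        if all(token.get(key) == value for key, value in required.items())
    ]


female_voices = find_tokens(VOICE_TOKENS, gender="Female")
spanish_voices = find_tokens(VOICE_TOKENS, language="Spanish")
print([token["name"] for token in female_voices])
print([token["name"] for token in spanish_voices])
```

A query for a language that is not installed simply returns an empty list, which is the situation the original question about Persian voices would run into if no matching voice token exists.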
Setting Wave Formats


Let's start with wave formats, since speech is generated as a real-time audio stream in wave format. Wave formats are defined in the list of SAPI constants found in your Speech library file and are in the following form:
SAFTDefault = -1
SAFTNoAssignedFormat = 0
SAFTText = 1
SAFTNonStandardFormat = 2
SAFTExtendedAudioFormat = 3
SAFT8kHz8BitMono = 4
SAFT8kHz8BitStereo = 5
SAFT8kHz16BitMono = 6
SAFT8kHz16BitStereo = 7
SAFT11kHz8BitMono = 8
SAFT11kHz8BitStereo = 9
...
You will notice that the format names start with SAFT, for "Speech Audio Format Type", followed by a string of numbers and letters. What do they stand for? The 8kHz, 11kHz, etc. stand for the sampling rate: the number of times per second, expressed in Hertz, that the audio is "sampled". In general, you want to sample at a rate of at least twice the highest frequency you wish to capture. So 8kHz, or 8000Hz, is just about perfect for the low-quality phone audio, which has a cutoff frequency of 3500Hz.
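The rule of thumb above is simple arithmetic, sketched here in Python. The function names are illustrative only. The 3500Hz cutoff and the 8000Hz rate are the figures quoted in the text.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Rule of thumb: sample at least twice the highest frequency of interest."""
    return 2 * highest_frequency_hz


def is_rate_sufficient(sampling_rate_hz, highest_frequency_hz):
    """True when the sampling rate meets the twice-the-frequency rule."""
    return sampling_rate_hz >= minimum_sampling_rate(highest_frequency_hz)


PHONE_CUTOFF_HZ = 3500  # cutoff frequency of phone audio quoted in the text

print(minimum_sampling_rate(PHONE_CUTOFF_HZ))     # 7000
print(is_rate_sufficient(8000, PHONE_CUTOFF_HZ))  # True
```

So phone audio needs at least 7000 samples per second, and the 8kHz formats clear that with a small margin.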
The number of bits indicates how many bits are used to store each sample. A higher bit depth gives more resolution per sample: 8 bits can hold 256 levels of granularity, while 16 bits can resolve 65536! Normally, the higher the sampling rate and the more bits used to store the sampled data, the better the audio quality. But keep in mind what your telephony device is designed to accommodate and the limitations of the type of line over which you will be sending your audio. Unless you are using digital lines on an in-house system with phones capable of extended dynamic range, the normal 8k and 11k mono formats are all you will ever need. This will give you better performance and allow for smaller wave files.
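As a rough illustration of the trade-off between quality and file size, the following Python sketch computes the number of levels for a given bit depth and the uncompressed PCM data rate for a format. The helper names are made up for this example, and the byte-rate formula assumes plain uncompressed PCM.

```python
def quantization_levels(bits_per_sample):
    """Number of distinct values a sample of this bit depth can hold."""
    return 2 ** bits_per_sample


def bytes_per_second(sampling_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio (assumes whole bytes per sample)."""
    return sampling_rate_hz * (bits_per_sample // 8) * channels


print(quantization_levels(8))         # 256
print(quantization_levels(16))        # 65536
print(bytes_per_second(8000, 8, 1))   # 8000
print(bytes_per_second(8000, 16, 2))  # 32000
```

An 8kHz 8-bit mono recording takes a quarter of the space of the 8kHz 16-bit stereo one, which is why the mono formats are usually enough for telephony.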
The constants listed above are the wave format enumerations available in the SpeechAudioFormatType token. There are about 68 formats currently. You can access these by their constant name or by their index number. This makes it very easy to access an individual wave format or to populate something that has an index, like a combobox, with all the wave formats. But what if you don't want all of them? What if you only want the mono wave formats? Well, to do that you have two choices. You could use the constants by name like this. The code is very similar for Delphi and VB.NET:
MyWavFormatType := SAFT11kHz8BitMono;
But since constants are only used at compile time, you can't put them in an array or a combobox and reference them later. So for now, it's enough that you know you can use the SAPI wave format constants if you want to, but we will focus on how to manage a large list of wave formats and refer to them by their index.
You could reference the wave format types by their constant name or by any text name you choose by creating an object and storing the string name of the wave format with its index. In this way, the index stored with each combobox item will match the ID in the constants list. Here is the code:
In Delphi...


ComboBoxWaveFormats.Items.AddObject('SAFT8kHz8BitMono', TObject(4));
ComboBoxWaveFormats.Items.AddObject('SAFT8kHz16BitMono', TObject(6));
ComboBoxWaveFormats.Items.AddObject('SAFT11kHz8BitMono', TObject(8));
Notice I am only adding the even-numbered wave formats, the ones in mono, and storing each with its index. In Visual Basic (VB 5 or VB 6), we could create a procedure called AddFmts like this:
In VB...


Private Sub AddFmts(ByRef name As String, ByVal fmt As SpeechAudioFormatType)
    ' Use the constants in the SAPI SpeechLib file globals section
    ' Fill the ComboWaveFormat box with the format name and its index
    Dim Index As Integer
    ' Get the count of the existing list so that we add to the bottom of the list
    Index = ComboWaveFormat.ListCount
    ' Add the name to the list box and associate the format type with the item
    ComboWaveFormat.AddItem name, Index
    ComboWaveFormat.ItemData(Index) = fmt
End Sub
and then populate a list like this:
AddFmts "SAFT8kHz16BitMono", SAFT8kHz16BitMono
AddFmts "SAFT11kHz8BitMono", SAFT11kHz8BitMono
AddFmts "SAFT11kHz16BitMono", SAFT11kHz16BitMono
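The same name-with-index pairing can be sketched outside of Delphi or VB. This Python version is only an illustration: it uses the subset of SAPI constants listed earlier and a hypothetical `mono_formats` helper that keeps the mono entries together with their constant values.

```python
# Subset of the SAPI constants listed above (name -> index).
SAFT_FORMATS = {
    "SAFT8kHz8BitMono": 4,
    "SAFT8kHz8BitStereo": 5,
    "SAFT8kHz16BitMono": 6,
    "SAFT8kHz16BitStereo": 7,
    "SAFT11kHz8BitMono": 8,
    "SAFT11kHz8BitStereo": 9,
}


def mono_formats(formats):
    """Keep only the mono entries as (name, index) pairs, sorted by index."""
    pairs = [(name, index) for name, index in formats.items() if name.endswith("Mono")]
    return sorted(pairs, key=lambda pair: pair[1])


# Each list item carries its display name and the format index stored with it.
combo_items = mono_formats(SAFT_FORMATS)
for name, index in combo_items:
    print(name, index)
```

Looking up `combo_items[i][1]` for the selected item then plays the role of `ItemData` in VB or the `TObject` value in Delphi.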

This is what you would do if you want to access the wave formats directly, but TeleTools has its own constants list, which not coincidentally matches the order of the Microsoft list. This is all derived from the Windows multimedia sound specification. The benefit of using TeleTools is that you maintain consistency in your program and can use our etPlay and etRecord components along with SAPI to set your wave formats. In addition, TeleTools allows you to refer to both the name and the ID of wave formats the same way it does for things like TAPI devices.
Here are a few by ID, they all start with 'wf':
wfUnknown = 0
wfPCM08000M08 = 1
wfPCM08000S08 = 2
wfPCM08000M16 = 3
wfPCM08000S16 = 4
and here are the same ones by a friendly name, they all start with 'S':
SwfUnknown = "Unknown"
SwfPCM08000M08 = "PCM 8,000 Hz, 8-bit, Mono"
SwfPCM08000S08 = "PCM 8,000 Hz, 8-bit, Stereo"
SwfPCM08000M16 = "PCM 8,000 Hz, 16-bit, Mono"
SwfPCM08000S16 = "PCM 8,000 Hz, 16-bit, Stereo"
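The two lists above appear to follow a regular naming pattern, so a friendly name can be derived from an ID name. The Python sketch below assumes the pattern is "wf", a codec name, a five-digit sampling rate, M or S for the channel layout, and a two-digit bit depth. That pattern is inferred from the five examples shown, not taken from TeleTools documentation, and it does not cover wfUnknown.

```python
import re

# Assumed pattern: "wf" + codec letters + 5-digit rate + M/S + 2-digit bit depth.
_ID_PATTERN = re.compile(r"^wf([A-Za-z]+)(\d{5})([MS])(\d{2})$")


def friendly_name(format_id):
    """Build a display string such as 'PCM 8,000 Hz, 8-bit, Mono' from an ID name."""
    match = _ID_PATTERN.match(format_id)
    if match is None:
        raise ValueError(f"unrecognized format id: {format_id}")
    codec, rate, channel, bits = match.groups()
    channel_name = "Mono" if channel == "M" else "Stereo"
    return f"{codec} {int(rate):,} Hz, {int(bits)}-bit, {channel_name}"


print(friendly_name("wfPCM08000M08"))
print(friendly_name("wfPCM08000S16"))
```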

So now you could create a list of audio wave formats by name like this:
ComboBoxWaveFmtNames.Items.Add('PCM 8,000 Hz, 8-bit, Mono');
ComboBoxWaveFmtNames.Items.Add('PCM 8,000 Hz, 16-bit, Mono');
ComboBoxWaveFmtNames.Items.Add('PCM 11,025 Hz, 8-bit, Mono');


or by name for display but referred to programmatically by ID like this:
ComboBoxWaveFmtIDs.Items.AddObject('wfPCM08000M08', TObject(1));
ComboBoxWaveFmtIDs.Items.AddObject('wfPCM08000M16', TObject(3));
ComboBoxWaveFmtIDs.Items.AddObject('wfPCM11025M08', TObject(5));


What this can lead to is code as simple as the following where we create a variable of type TWAVFORMATS (a TeleTools type), populate a list with all the wave formats in one line of code and then set our desired wave format with only one more line of code:
procedure TForm1.MyProcedure(Sender: TObject);
var
  X: TWAVFORMATS;
begin
  // Add the friendly name of every wave format, in enumeration order
  for X := wfUnknown to wfIMAADPCM08000M04 do
    ComboBox1.Items.Add(ET_WAV_FOMAT[X].sName);
end;

procedure TForm1.SetFormats(Sender: TObject);
begin
  // The item index matches the enumeration value, so a cast is enough
  etRecord1.Source.Format.ID := TWAVFORMATS(ComboBox1.ItemIndex)
end;
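The trick in the Delphi code is that the list is filled in enumeration order, so an item's position in the list is also its format ID. A minimal Python sketch of that idea follows, using only the five formats and friendly names shown earlier. `WavFormat` and `format_from_item_index` are illustrative names, not TeleTools identifiers.

```python
from enum import IntEnum


class WavFormat(IntEnum):
    """The five TeleTools IDs listed above, in the same order."""
    wfUnknown = 0
    wfPCM08000M08 = 1
    wfPCM08000S08 = 2
    wfPCM08000M16 = 3
    wfPCM08000S16 = 4


FRIENDLY_NAMES = {
    WavFormat.wfUnknown: "Unknown",
    WavFormat.wfPCM08000M08: "PCM 8,000 Hz, 8-bit, Mono",
    WavFormat.wfPCM08000S08: "PCM 8,000 Hz, 8-bit, Stereo",
    WavFormat.wfPCM08000M16: "PCM 8,000 Hz, 16-bit, Mono",
    WavFormat.wfPCM08000S16: "PCM 8,000 Hz, 16-bit, Stereo",
}

# Fill the list in enumeration order, so item index == format ID.
combo_items = [FRIENDLY_NAMES[fmt] for fmt in WavFormat]


def format_from_item_index(item_index):
    """Convert a selected list position back into its format ID."""
    return WavFormat(item_index)


selected = format_from_item_index(3)
print(selected.name, "->", combo_items[selected])
```

This only works as long as the list contains every format with no gaps or reordering. A filtered list, such as mono formats only, needs the ID stored with each item instead.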
ExceleTel created a sample program just to show you how to work with wave formats. It's called etWaveFormats and is available for download.
Now that we have learned how to get and set wave formats, let's see how to put it into practice as far as TAPI telephony and SAPI speech are concerned. Click below to continue to the next page where we will see how to

AbiriAmir
Monday 02 Shahrivar 1388, 20:30
Excuse me,
could you explain this a bit more?
Also, can it pronounce Persian this way?
And this way it only speaks on the computer. In an IVR system, how does it speak over the phone line?
Also, does this do speaking or recognizing?
I have a question about IVR systems too, which I will ask you later.

AbiriAmir
Monday 02 Shahrivar 1388, 20:34
My question about IVR systems is this one:
http://barnamenevis.org/forum/showthread.php?p=786476
I would be grateful for your help.

عقاب سیاه
Saturday 07 Shahrivar 1388, 08:13
Excuse me, could you post some source code for this Persian pronunciation?