== Compile HTK ==

- Enter 32-bit chroot

- Edit configure.in and comment out these two lines:

bindir=${bindir}.${host_cpu}
libdir=${libdir}.${host_cpu}

- Run autoconf:

$ autoconf
 
- Run configure:

$ export CPU=i386
$ ./configure CC=gcc-3.3

- Compile (as root, unfortunately):

$ sudo make

- Make chroot symlinks, so that each tool can be called from outside the chroot as <name>32 (e.g. HDMan32):

$ cd /chroot/usr/local/bin
$ for f in {H*,Cluster,L*}; do sudo ln -s "$f" "${f}32"; done
$ cd /usr/local/bin
$ for f in /chroot/usr/local/bin/{H*,Cluster,L*}32; do sudo ln -s /usr/local/bin/do_dchroot "$(basename "$f")"; done



== Create general dictionary ==

The dictionary uses a slightly modified version of SAMPA for Swedish
(http://www.phon.ucl.ac.uk/home/sampa/swedish.htm). See transcription.txt.

numerals-swe.dict was written by hand.


== Create training and test utterances ==

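- Generate 200 random utterances from the grammar:

$ echo 'gr -number=200 | l | wf random_utts.txt' | gf grammar/NumeralsSwe.gf

- Number the non-empty lines, then take the first 100 for training and the last 100 for testing:

$ grep -v '^$' random_utts.txt | nl -v0 -n rz -s": " | head -n100 > train_utts.txt
$ grep -v '^$' random_utts.txt | nl -v0 -n rz -s": " | tail -n100 > test_utts.txt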
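
The numbering and the split can be checked on a small sample first. The
sketch below assumes GNU nl; the sample_* file names and the four utterances
are made up for illustration. Note that the two sets are only disjoint when
the input has at least as many non-empty lines as head and tail take together.

```shell
# Work in a scratch directory so nothing in the project is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Four made-up utterances with blank lines in between.
printf 'ett\n\nsju\ntolv\n\nhundra\n' > sample_utts.txt

# Same pipeline as above, taking 2 lines per set instead of 100.
grep -v '^$' sample_utts.txt | nl -v0 -n rz -s": " | head -n2 > sample_train.txt
grep -v '^$' sample_utts.txt | nl -v0 -n rz -s": " | tail -n2 > sample_test.txt

# Show both sets; every line starts with a zero-padded utterance number.
cat sample_train.txt sample_test.txt
```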

== Record data ==

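- Compile the prompting program:

$ ghc -package hsshellscript --make -o prompt prompt.hs

- Record the training utterances in HSLab, following the prompts:

$ cd train_data
$ HSLab &
$ ../prompt ../train_utts.txt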


== Create word list ==

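- Print the full-form lexicon of the grammar:

$ echo 'pg -printer=fullform | wf fullform.txt' | gf grammar/NumeralsSwe.gf

- Keep only the text before the " :" separator on each line:

$ perl -pe 's/([^:]*)\s:.*/$1/' fullform.txt > wordlist.txt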


== Create application dictionary and monophone list ==

$ HDMan32 -m -w wordlist.txt -n monophones1 -l dlog dict.nosil numerals-swe.dict
$ echo '!SILENCE    sil' > dict
$ cat dict.nosil >> dict
$ if ! grep -q '^sil$' monophones1; then echo sil >> monophones1; fi




== Create transcription file ==

- Create mkphones0.led:

EX
IS sil sil
DE sp

- Create word-level MLF file for training utterances:

$ ./prompts2mlf train_utts.txt > train_utts_words.mlf
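
The prompts2mlf script itself is not listed in these notes. A minimal sketch
of what such a conversion could look like, assuming the "NNNNNN: words" format
of train_utts.txt and label names of the form utt<number>.lab (the pattern the
later re-alignment step expects). The function name prompts_to_mlf and the
sample prompts are made up.

```shell
# Hypothetical stand-in for ./prompts2mlf: numbered prompts in, word-level MLF out.
prompts_to_mlf() {
  echo '#!MLF!#'
  awk -F': ' '{
    printf "\"*/utt%s.lab\"\n", $1            # label name for this utterance
    n = split($2, words, " ")
    for (i = 1; i <= n; i++) print words[i]   # one word per line
    print "."                                 # end of this utterance
  }' "$1"
}

# Small made-up prompt file in the format of train_utts.txt.
sample_prompts=$(mktemp)
printf '000000: ett hundra\n000001: tolv\n' > "$sample_prompts"
sample_mlf=$(prompts_to_mlf "$sample_prompts")
printf '%s\n' "$sample_mlf"
```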

- Create phone-level MLF file for training utterances:

$ HLEd32 -l '*' -d dict -i train_utts_phone.mlf mkphones0.led train_utts_words.mlf

== Parametrize the data ==

- Create the file param_config:

# Coding parameters
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
ENORMALISE = F

- Create directory for storing parametrized inputs:

$ mkdir train_param

- Create a script file for HCopy:

$ for u in train_data/*; do echo "$u" "train_param/$(basename "$u").mfc"; done > codetr.scp

- Do the parametrization:

$ HCopy32 -T 1 -C param_config -S codetr.scp


== Create monophone HMMs ==


- Create the file proto (the prototype HMM definition passed to HCompV below).
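
The contents of proto are not recorded in these notes. The sketch below
generates a prototype along the lines of the HTK Book tutorial example: a
left-to-right model with 3 emitting states, zero means and unit variances.
The topology and the transition values are assumptions taken from that
example, not necessarily what was used here; make_proto and repeat_value are
made-up helper names.

```shell
# Print COUNT copies of VALUE on one line.
repeat_value() {
  awk -v value="$1" -v count="$2" \
    'BEGIN { for (i = 1; i <= count; i++) printf " %s", value; print "" }'
}

# Print a 5-state (3 emitting) prototype for vectors of the given size.
make_proto() {
  vecsize=$1
  echo "~o <VecSize> $vecsize <MFCC_0_D_A>"
  echo '~h "proto"'
  echo '<BeginHMM>'
  echo '<NumStates> 5'
  for state in 2 3 4; do
    echo "<State> $state"
    echo "<Mean> $vecsize"
    repeat_value 0.0 "$vecsize"
    echo "<Variance> $vecsize"
    repeat_value 1.0 "$vecsize"
  done
  # Assumed transition matrix: stay in a state or move one state to the right.
  cat <<'EOF'
<TransP> 5
 0.0 1.0 0.0 0.0 0.0
 0.0 0.6 0.4 0.0 0.0
 0.0 0.0 0.6 0.4 0.0
 0.0 0.0 0.0 0.7 0.3
 0.0 0.0 0.0 0.0 0.0
<EndHMM>
EOF
}

proto_text=$(make_proto 39)
printf '%s\n' "$proto_text" | head -n 6
```

To use it, redirect the output of make_proto 39 to the file proto.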

- Create train.scp with the list of training files:

$ ls train_param/*.mfc > train.scp

- Create a config file for the training:

$ cp param_config train_config

- Change train_config, setting:

TARGETKIND = MFCC_0_D_A

- Calculate initial Gaussians:

$ mkdir hmm0
$ HCompV32 -C train_config -f 0.01 -m -S train.scp -M hmm0 proto

- Create hmm0/macros file:

$ echo '~o <VecSize> 39 <MFCC_0_D_A>' > hmm0/macros
$ cat hmm0/vFloors >> hmm0/macros
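
The 39 in the ~o line follows from the coding parameters: 12 cepstral
coefficients plus C0 give 13 static values, and the _D and _A qualifiers add
delta and acceleration coefficients for each of them. HTK gives times in
units of 100 ns, so the same arithmetic also shows the frame shift and window
length in milliseconds. A small worked check (the variable names are made up):

```shell
# Values copied from param_config / train_config.
target_rate=100000     # TARGETRATE, in units of 100 ns
window_size=250000     # WINDOWSIZE, in units of 100 ns
num_ceps=12            # NUMCEPS

frame_shift_ms=$(( target_rate / 10000 ))   # 10000 units of 100 ns = 1 ms
window_ms=$(( window_size / 10000 ))
static=$(( num_ceps + 1 ))                  # cepstra plus C0 (the _0 qualifier)
vec_size=$(( static * 3 ))                  # static + delta (_D) + acceleration (_A)

echo "frame shift: ${frame_shift_ms} ms, window: ${window_ms} ms, vector size: ${vec_size}"
```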

- Create monophones0:

$ grep -v '^sp$' monophones1 > monophones0

- Create hmm0/hmmdefs:

$ ./proto2hmmdefs hmm0/proto monophones0 > hmm0/hmmdefs
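
The proto2hmmdefs script is not listed here. A minimal sketch of the idea,
assuming it copies the model body of the prototype (from <BEGINHMM> through
<ENDHMM>) once per phone under a new ~h name, and drops the ~o header since
that goes into macros. The function name proto_to_hmmdefs and the tiny demo
files are made up.

```shell
# Hypothetical stand-in for ./proto2hmmdefs: one copy of the prototype per phone.
proto_to_hmmdefs() {
  proto=$1
  phones=$2
  # Keep only the model body; HTK writes the tags in upper case, so ignore case.
  body=$(awk 'toupper($0) ~ /<BEGINHMM>/ { on = 1 }
              on { print }
              toupper($0) ~ /<ENDHMM>/ { on = 0 }' "$proto")
  while read -r phone; do
    if [ -n "$phone" ]; then
      printf '~h "%s"\n%s\n' "$phone" "$body"
    fi
  done < "$phones"
}

# Demo on a cut-down prototype and a two-phone list.
demo_dir=$(mktemp -d)
printf '%s\n' '~o <VECSIZE> 39 <MFCC_0_D_A>' '~h "proto"' \
  '<BEGINHMM>' '<NUMSTATES> 5' '<ENDHMM>' > "$demo_dir/proto"
printf 'a\nsil\n' > "$demo_dir/monophones0"
demo_hmmdefs=$(proto_to_hmmdefs "$demo_dir/proto" "$demo_dir/monophones0")
printf '%s\n' "$demo_hmmdefs"
```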

- Re-estimate:

$ mkdir hmm1
$ HERest32 -C train_config -I train_utts_phone.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0/macros -H hmm0/hmmdefs -M hmm1 monophones0

$ mkdir hmm2
$ HERest32 -C train_config -I train_utts_phone.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm1/macros -H hmm1/hmmdefs -M hmm2 monophones0

$ mkdir hmm3
$ HERest32 -C train_config -I train_utts_phone.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm2/macros -H hmm2/hmmdefs -M hmm3 monophones0
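
The three rounds differ only in the directory numbers, so they can be
generated by a loop. The sketch below only prints the commands (a dry run);
herest_rounds is a made-up helper name, and its output can be piped to sh to
run the commands for real.

```shell
# Print the mkdir/HERest32 commands for rounds FIRST+1 .. LAST.
herest_rounds() {
  first=$1; last=$2; mlf=$3; scp=$4; list=$5
  i=$first
  while [ "$i" -lt "$last" ]; do
    next=$(( i + 1 ))
    echo "mkdir hmm$next"
    echo "HERest32 -C train_config -I $mlf -t 250.0 150.0 1000.0 -S $scp -H hmm$i/macros -H hmm$i/hmmdefs -M hmm$next $list"
    i=$next
  done
}

# Rounds hmm0 -> hmm1 -> hmm2 -> hmm3, as above.
rounds=$(herest_rounds 0 3 train_utts_phone.mlf train.scp monophones0)
printf '%s\n' "$rounds"
```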

== Fixing the silence models ==

- Add sp model:

$ mkdir hmm4
$ cp hmm3/macros hmm4/macros
$ ./makespmodel.pl < hmm3/hmmdefs > hmm4/hmmdefs
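
makespmodel.pl is not listed here. A rough sketch of the idea, assuming it
follows the usual recipe: copy all models through unchanged and append a
3-state sp model whose single emitting state is a copy of the centre state
(state 3) of sil. The transition values for sp are an assumption, and
make_sp_model and the two-dimensional demo model are made up.

```shell
# Hypothetical stand-in for ./makespmodel.pl, reading hmmdefs on stdin.
make_sp_model() {
  awk '
    { print }                                          # pass every line through
    /^~h "sil"/                         { in_sil = 1 }
    in_sil && toupper($1) == "<STATE>"  { grab = ($2 == 3) }
    in_sil && toupper($1) == "<TRANSP>" { grab = 0 }
    in_sil && toupper($1) == "<ENDHMM>" { in_sil = 0 }
    grab && toupper($1) != "<STATE>"    { centre = centre $0 "\n" }
    END {
      print "~h \"sp\""
      print "<BEGINHMM>"
      print "<NUMSTATES> 3"
      print "<STATE> 2"
      printf "%s", centre                              # body of sil state 3
      print "<TRANSP> 3"
      print " 0.0 1.0 0.0"                             # assumed values
      print " 0.0 0.9 0.1"
      print " 0.0 0.0 0.0"
      print "<ENDHMM>"
    }
  '
}

# Demo on a cut-down sil model with 2-dimensional vectors.
demo_out=$(printf '%s\n' '~h "sil"' '<BEGINHMM>' '<NUMSTATES> 5' \
  '<STATE> 2' '<MEAN> 2' ' 1.0 1.0' '<VARIANCE> 2' ' 1.0 1.0' \
  '<STATE> 3' '<MEAN> 2' ' 2.0 2.0' '<VARIANCE> 2' ' 1.0 1.0' \
  '<STATE> 4' '<MEAN> 2' ' 3.0 3.0' '<VARIANCE> 2' ' 1.0 1.0' \
  '<TRANSP> 5' '<ENDHMM>' | make_sp_model)
printf '%s\n' "$demo_out" | sed -n '/^~h "sp"/,$p'
```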

- Create sil.hed:

AT 2 4 0.2 {sil.transP}
AT 4 2 0.2 {sil.transP}
AT 1 3 0.3 {sp.transP}
TI silst {sil.state[3],sp.state[2]}

- Add extra transitions and tie sp to sil:

$ mkdir hmm5
$ HHEd32 -H hmm4/macros -H hmm4/hmmdefs -M hmm5 sil.hed monophones1

- Re-estimate:

$ mkdir hmm6
$ HERest32 -C train_config -I train_utts_phone.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm5/macros -H hmm5/hmmdefs -M hmm6 monophones1

$ mkdir hmm7
$ HERest32 -C train_config -I train_utts_phone.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm6/macros -H hmm6/hmmdefs -M hmm7 monophones1



== Re-align the training data ==

- Force-align the training transcriptions using the hmm7 models:

$ HVite32 -l '*' -o SWT -b '!SILENCE' -C train_config -a -H hmm7/macros -H hmm7/hmmdefs -i train_utts_aligned.mlf -m -t 250.0 -y lab -I train_utts_words.mlf -S train.scp dict monophones1

- Generate train-aligned.scp:

NOTE: HVite skips utterances it cannot align, so after re-alignment not
      all input files have an entry in the MLF file. We therefore need
      a new script file to use in place of train.scp.

$ perl -ne 'if (s/^"\*\/(utt\d+)\.lab"$/$1/){chomp;print "train_param/$_.mfc\n";}' train_utts_aligned.mlf > train-aligned.scp



- Re-estimate:

$ mkdir hmm8
$ HERest32 -C train_config -I train_utts_aligned.mlf -t 250.0 150.0 1000.0 -S train-aligned.scp -H hmm7/macros -H hmm7/hmmdefs -M hmm8 monophones1

$ mkdir hmm9
$ HERest32 -C train_config -I train_utts_aligned.mlf -t 250.0 150.0 1000.0 -S train-aligned.scp -H hmm8/macros -H hmm8/hmmdefs -M hmm9 monophones1


== Create tied-state triphones ==

=== Create list of all triphones in dictionary ===

- Create mktridict.ded:

TC

- Create triphone list from dictionary:

$ HDMan32 -b sp -g mktridict.ded -n triphones1 /dev/null dict


=== Annotate training data with triphones ===

- Create mktri.led:

WB sp
WB sil
TC

- Create triphone MLF:

$ HLEd32 -l '*' -i train_utts_wintri.mlf mktri.led train_utts_aligned.mlf

=== Clone monophone models ===

- Create mktri.hed:

$ ./maketrihed monophones1 triphones1 > mktri.hed
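
The maketrihed script is not listed here. A minimal sketch, assuming it
produces the same kind of edit script as the HTK Book tutorial: a CL command
that clones the monophones into the models named in the triphone list,
followed by one TI command per monophone that ties the transition matrices
of all triphones sharing that centre phone. make_tri_hed and the two-phone
demo list are made up.

```shell
# Hypothetical stand-in for ./maketrihed: monophone list and triphone list name in,
# HHEd edit script out.
make_tri_hed() {
  monophones=$1
  triphones=$2
  echo "CL $triphones"
  while read -r p; do
    if [ -n "$p" ]; then
      # Tie the transition matrices of all models with centre phone p.
      printf 'TI T_%s {(*-%s+*,%s+*,*-%s).transP}\n' "$p" "$p" "$p" "$p"
    fi
  done < "$monophones"
}

# Demo with a two-phone list.
demo_phones=$(mktemp)
printf 'a\nsil\n' > "$demo_phones"
demo_hed=$(make_tri_hed "$demo_phones" triphones1)
printf '%s\n' "$demo_hed"
```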

- Clone monophone models to make triphone models:

$ mkdir hmm10
$ HHEd32 -H hmm9/macros -H hmm9/hmmdefs -M hmm10 mktri.hed monophones1

=== Re-estimate using triphone data ===

$ mkdir hmm11
$ HERest32 -B -C train_config -I train_utts_wintri.mlf -t 250.0 150.0 1000.0 -S train-aligned.scp -H hmm10/macros -H hmm10/hmmdefs -M hmm11 triphones1

$ mkdir hmm12
$ HERest32 -B -C train_config -I train_utts_wintri.mlf -t 250.0 150.0 1000.0 -S train-aligned.scp -H hmm11/macros -H hmm11/hmmdefs -M hmm12 triphones1



== Test ==