<
p
>
As of 2023, there is no publicly available ASR model that can accurately and reliably process child speech.
p
><
p
class
=
>
In our IHIET 2023 article, we introduce two innovations with which the problem can be partially bypassed in the context of a digitally supported reading acquisition app:
p
><
ol
><
li
>
Transformation of a generic ASR problem into a sort of extended multi-class classification problem by extending a generic acoustic model with a domain-specific, minimalist language model (“scorer”).
li
><
li
>
Human-machine peer learning (HMPL), whereby the artificial utterance-processing tutor U incrementally and gradually adapts its parameters to a particular learner, a human individual I.
li
>
ol
><
p
>
In concrete terms, we have shown that three sessions focusing on the acquisition of grapheme-vowel and CV-bigrapheme correspondences led, in the case of one particular learner, to a decrease in WER from 96% to 48%.
p
>
dh@udk-berlin.de
<
br
/><
br
/>
https://fibel.digital
<
br
/><
br
/>
Slava
Ukrajine
<
p
class
=
fragment
>
Learner 1 (L1) is a 5-year-old bilingual preschooler (90% German, 10% Slovak), the daughter of the main author of this article
p
><
p
class
=
fragment
>
three HMPL-C2 exercise 1 (E1) sessions were conducted on days 1, 3 and 5 of the study
p
><
p
class
=
fragment
>
each HMPL-C2-E1 session consisted of a human-testing phase followed by a mutual human-machine learning phase
p
><
p
class
=
fragment
>
in each phase, every sequence consisted of 5 repetitions of a syllable that starts with the occlusive labial consonant M or B and is followed by the vowel A, E, I, O or U, thus yielding sequences from MA MA MA MA MA to BU BU BU BU BU
p
><
p
class
=
fragment
>
speech recordings collected during the learning phase subsequently served as input for fine-tuning the acoustic model
p
>
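The syllable sequences used in each phase (from MA MA MA MA MA to BU BU BU BU BU) can be enumerated programmatically. The following is only a minimal sketch: the names `CONSONANTS`, `VOWELS`, `REPETITIONS` and `build_sequences` are hypothetical and are not taken from the Primer's code base.

```python
from itertools import product

# Assumed inventory, taken from the exercise description on the slide.
CONSONANTS = ["M", "B"]
VOWELS = ["A", "E", "I", "O", "U"]
REPETITIONS = 5


def build_sequences(consonants=CONSONANTS, vowels=VOWELS, repetitions=REPETITIONS):
    """Return one space-separated sequence per consonant-vowel syllable."""
    return [
        " ".join([consonant + vowel] * repetitions)
        for consonant, vowel in product(consonants, vowels)
    ]


sequences = build_sequences()
```

Each returned string can then serve as the text displayed to the learner for one trial.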
<
p
class
=
fragment
>
the sequences of five vowels or CV syllables displayed by DP were taken as the “reference”; the output of the model provided the “hypotheses”
p
><
p
class
=
fragment
>
data provided by L1 during the three testing phases on days 1, 3 and 5 were evaluated by means of 5 different models
p
><
p
class
=
fragment
>
DeepSpeech de: baseline model;
KIds-0: DeepSpeech de fine-tuned with kidsTALC;
KIds-L1-1: KIds-0 fine-tuned with data provided by L1 during the day 1 learning phase;
KIds-L1-3: KIds-L1-1 fine-tuned with data provided by L1 during the day 3 learning phase;
KIds-L1-5: KIds-L1-3 fine-tuned with data provided by L1 during the day 5 learning phase
p
><
p
>
NOTE: a decrease in WER between rows corresponds to an increase in the accuracy of the ASR model; a decrease in WER between columns points to an increase in L1's reading competence
p
>
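For readers unfamiliar with the metric: word error rate (WER) is the word-level edit distance between reference and hypothesis, divided by the number of reference words. The sketch below illustrates that standard definition; the function name `word_error_rate` is hypothetical and this is not the evaluation script used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "MA MA MA MA MA", a hypothesis of "MA NA MA" needs one substitution and two deletions, so three of five reference words are in error.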
 
<
p
class
=
fragment
>
in the majority of reading exercises included in the Primer, we already know the text in advance
p
><
p
class
=
fragment
>
we already know which utterances can be considered correct readings and which cannot
p
><
p
class
=
fragment
>
to every specific exercise, such as vowel or syllable recognition, the Primer associates a specific language model (scorer) which constrains the connectionist temporal classification (CTC) beam search to a restricted set of exercise-relevant answers
p
><
p
class
=
fragment
>
this significantly constrains the search space of plausible solutions
p
><
p
class
=
fragment
>
INNOVATION 1: implementing exercise-specific scorers transforms a generic ASR problem (difficult) into a multi-class classification problem (easier)
p
>
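The effect of restricting the output to a closed set of answers can be illustrated with a toy example. Note that this is a swapped-in simplification and not the scorer-constrained CTC beam search described above: it takes an already decoded, free-form transcript and snaps it to the closest allowed answer by character-level edit distance. The names `edit_distance` and `classify` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # delete char_a
                curr[j - 1] + 1,                   # insert char_b
                prev[j - 1] + (char_a != char_b),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def classify(transcript: str, allowed_answers: list[str]) -> str:
    """Pick the allowed answer closest to the transcript (first one wins ties)."""
    if not allowed_answers:
        raise ValueError("allowed_answers must not be empty")
    return min(allowed_answers, key=lambda answer: edit_distance(transcript, answer))


answers = ["MA MA MA MA MA", "BA BA BA BA BA", "BU BU BU BU BU"]
best = classify("MA MA NA MA MA", answers)
```

Because the decision is made among a handful of exercise-relevant classes, a single misrecognized syllable no longer produces an out-of-vocabulary result; it only shifts the distance to each candidate.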
Human-Machine Peer Learning (HMPL) provides a paradigm for constructing human-machine learning curricula from which both humans and machines benefit
.<
p
class
=
fragment
>
In our previous HMPL Curriculum 1 (HMPL-C1) study, which focused on extending the foreign-language vocabulary of human learners and increasing the speech-recognition accuracy of artificial learners, we observed an increase in the number of matches between expected and predicted labels. This increase was caused both by growth of the human learner's vocabulary and by improved recognition accuracy of the machine's speech-to-text model (Hromada &amp; Kim, <em>Proof-of-concept of feasibility of human machine peer learning for German noun vocabulary learning</em>, Frontiers in Education, 2023)
p
>
<
p
class
=
fragment
>
reading is essentially a process of translating textual sequences into their phonetic representations
p
><
p
class
=
fragment
>
the spoken word thus plays a fundamental role in reading acquisition
p
><
p
class
=
fragment
>
highly accurate automatic speech recognition (ASR) systems exist for many languages, but they are still strongly biased towards accurate processing of adult voices
p
><
p
class
=
fragment
>
HOWEVER: in reading acquisition or reading fostering scenarios, one deals with speakers whose utterances of sequences-to-be-read exhibit peculiar characteristics
p
>
<
p
class
=
fragment
>
the majority of those who learn how to read are children
p
><
p
class
=
fragment
>
children differ from adults physiologically (differences in size and anatomy of the vocal tract; change of teeth) and cognitively
p
><
p
class
=
fragment
>
children's voices are different from adult voices (e.g. fundamental frequency F0(male voice) ≈ 112.0 Hz; F0(female voice) ≈ 195.8 Hz; F0(boy voice) ≈ 250.0 Hz; F0(girl voice) ≈ 244.0 Hz)
p
><
p
class
=
fragment
>
publicly available datasets of children's speech are quite scarce
p
><
p
class
=
fragment
>
the authors of kidsTALC (Rumberg et al., 2022) report a word error rate (WER) of 26.2% for typically developing monolingual German children
p
>
It is generally believed that the acquisition of reading skill(s) can be fostered (resp. inhibited) by a learner's exposure to an appropriate (resp. inappropriate) social, pedagogic and instrumental context.
It is also believed that well-designed digital tools can help children learn how to read.
DRAAs foster pupils' acquisition of reading skills where another tutor (ideally a human teacher, parent or peer) is not available or is unable to help the child master the cognitively challenging task of learning how to read.
The Primer is a post-smartphone, book-like, do-it-yourself educational instrument (Bildungsinstrument).
<
div
>
Develop an open-source software suite that can help elementary school pupils successfully enter the world of basic literacy.
div
>
<
div
>
Increase the digital competences of older students so that they are able to repair or improve existing Primers or construct new copies of their own.
div
>