NAME
Lingua::Jspell - Perl interface to the Jspell morphological analyser.
SYNOPSIS
    use Lingua::Jspell;

    my $dict = Lingua::Jspell->new("dict_name");
    my $dict = Lingua::Jspell->new("dict_name", "personal_dict_name");

    $dict->rad("gatinho");     # list of radicals (gato)
    $dict->fea("gatinho");     # list of possible analyses
    $dict->der("gato");        # list of derived words
    $dict->flags("gato");      # list of roots and flags
FUNCTIONS
new
Opens a dictionary. Pass it the dictionary name and, optionally, a personal dictionary name. It returns a new Jspell dictionary object.
nearmatches
This method returns a list of analyses for words that are near-matches to the supplied word. Note that the near-matches are computed even if the supplied word itself exists in the dictionary.
    @nearmatches = $dictionary->nearmatches('cavale');
To compute the list of words to analyze, the method uses a list of equivalence classes taken from the SNDCLASSES section of the dictionary's YAML file.
It is also possible to specify a list of user-defined classes. These are supplied as a filename that contains, per line, the characters that are equivalent (with spaces separating them):
ch x
ss ç
This example says that if a word uses ch, then it can be replaced by x for the near-matches calculation. The inverse is also true.
If these rules are stored in a file named classes.txt, you can supply this list with:
    @nearmatches = $dictionary->nearmatches('chaile', rules => 'classes.txt');
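The equivalence idea can be illustrated without a dictionary. The following is only a sketch: `parse_classes` and `variants` are hypothetical helpers, not part of Lingua::Jspell, and the module's own candidate generation may differ. It assumes each class line is a whitespace-separated list of interchangeable strings and produces every word obtained by swapping one occurrence.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: turn class lines such as "ch x" into array refs.
sub parse_classes {
    my ($text) = @_;
    return map { [ split ' ', $_ ] } grep { /\S/ } split /\n/, $text;
}

# Hypothetical helper: every variant of $word obtained by replacing one
# occurrence of a class member with another member of the same class.
sub variants {
    my ($word, @classes) = @_;
    my %seen;
    for my $class (@classes) {
        for my $from (@$class) {
            for my $to (grep { $_ ne $from } @$class) {
                my $pos = 0;
                while (($pos = index($word, $from, $pos)) >= 0) {
                    my $candidate = $word;
                    substr($candidate, $pos, length($from), $to);
                    $seen{$candidate} = 1;
                    $pos++;
                }
            }
        }
    }
    return sort keys %seen;
}

my @classes  = parse_classes("ch x\nss ç\n");
my @variants = variants('chaile', @classes);
```

With the two classes above, the only candidate for 'chaile' replaces the leading ch with x. Each candidate would then be looked up in the dictionary.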
setmode
    $dict->setmode({ flags => 0, nm => "off" });
- af: (add flags) Enable partial near misses, by using rules not officially associated with the current word. Does not give suggestions by changing letters in the original word. (default option)
- full: (add flags and change characters) Enable near misses: try to use rules where they are not applied, and try to give suggestions by swapping adjacent letters in the original word.
- cc: (change characters) Enable partial near misses, by swapping adjacent letters, or inserting or modifying letters, in the original word. Does not use rules not associated with the current word.
- off: Disable near misses altogether.
fea
Returns a list of analyses of a word. Each analysis is a list of attribute-value pairs. Attributes available: CAT, T, G, N, P, ....
    @l = $dic->fea($word)
    @l = $dic->fea($word, {...att. value pair restriction})
If a restriction is provided, only the analyses that satisfy it are returned.
flags
Returns the set of morphological flags associated with the word. Each flag is related to a set of morphological rules.
    @f = $dic->flags("gato")
rad
Returns the list of all possible radicals/lemmas for the supplied word.
    @l = $dic->rad($word)
der
Returns the list of all words that can be derived using the supplied word as the radical.
    @l = $dic->der($word);
onethat
Returns the first Feature Structure from the supplied list that verifies the Feature Structure Pattern used.
    %analysis = onethat( { CAT => 'adj' }, @features );
    %analysis = onethat( { CAT => 'adj' }, $pt->fea("espanhol") );
allthat
Returns all Feature Structures from the supplied list that verify the given Feature Structure Pattern.
    @analyses = allthat( { CAT => 'adj' }, @features );
    @analyses = allthat( { CAT => 'adj' }, $pt->fea("espanhol") );
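The relationship between the two functions can be sketched in plain Perl: one returns the first match, the other returns every match. This is not the module's implementation. `simple_match`, `first_match`, and `all_matches` are hypothetical names, the matcher is a simplified stand-in that assumes plain string equality, and the sample data is made up.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Simplified stand-in for pattern verification: every key in the pattern
# must be present in the feature structure with an equal string value.
sub simple_match {
    my ($pattern, $fs) = @_;
    for my $key (keys %$pattern) {
        return 0 unless defined $fs->{$key} && $fs->{$key} eq $pattern->{$key};
    }
    return 1;
}

# Hypothetical counterpart of onethat: first matching structure, as a hash.
sub first_match {
    my ($pattern, @features) = @_;
    my $hit = first { simple_match($pattern, $_) } @features;
    return $hit ? %$hit : ();
}

# Hypothetical counterpart of allthat: every matching structure.
sub all_matches {
    my ($pattern, @features) = @_;
    return grep { simple_match($pattern, $_) } @features;
}

# Made-up sample data; real analyses would come from $pt->fea(...).
my @features = (
    { CAT => 'nc',  G => 'm', N => 's' },
    { CAT => 'adj', G => 'm', N => 's' },
    { CAT => 'adj', G => 'f', N => 'p' },
);

my %analysis = first_match({ CAT => 'adj' }, @features);
my @analyses = all_matches({ CAT => 'adj' }, @features);
```

Here `%analysis` holds the second sample structure and `@analyses` holds the last two.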
verif
Returns a true value if the second Feature Structure verifies the first Feature Structure Pattern.
    if (verif($pattern, $feature)) { ... }
Use a value of undef, or an empty string, in the pattern, to force that key not to exist:
    if (verif( { FSEM => undef }, $feature )) { .. }
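The "key must not exist" rule is easy to misread, so here is a small illustration of one plausible reading. `matches_pattern` is a hypothetical re-implementation, not the module's verif: it assumes values are compared as plain strings, and the FSEM value in the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the matching rule, for illustration.
sub matches_pattern {
    my ($pattern, $feature) = @_;
    for my $key (keys %$pattern) {
        my $want = $pattern->{$key};
        if (!defined $want || $want eq '') {
            # undef or '' in the pattern: the key must be absent.
            return 0 if exists $feature->{$key};
        }
        else {
            # Assumption: values are compared as plain strings.
            return 0 unless defined $feature->{$key}
                         && $feature->{$key} eq $want;
        }
    }
    return 1;
}

my $plain    = { CAT => 'nc', G => 'm' };
my $semantic = { CAT => 'nc', G => 'm', FSEM => 'animal' };  # made-up value

my $ok_plain    = matches_pattern({ FSEM => undef }, $plain);     # key absent
my $ok_semantic = matches_pattern({ FSEM => undef }, $semantic);  # key present
```

Under these assumptions `$ok_plain` is true and `$ok_semantic` is false, because only the first structure lacks the FSEM key.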
nlgrep
    @line = $d->nlgrep( word , files);
    @line = $d->nlgrep( [word1, wordn] , files);
Or, with options to set a maximum number of entries, the record separator, or to use the radtxt file format:
    @line = $d->nlgrep( { max => 100, sep => "\n", radtxt => 0 } , pattern , files);
setstopwords
eagles
new_featags
featags
Given a word, returns a set of analyses. Each analysis is a morphosyntactic tag.
    @l = $pt->featags("lindas")
         JFS , ...
    @l = $pt->featags("era", { CAT => "v" })    ## with a constraint
featagsrad
Given a word, returns a set of analyses. Each analysis is a morphosyntactic tag together with the lemma information.
    @l = $pt->featagsrad("lindas")
         JFS:lindo , ...
    @l = $pt->featagsrad("era", { CAT => "v" })    ## with a constraint
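If each returned item is a string in the TAG:lemma shape shown in the sample output (an assumption based only on that output), the two parts can be separated with a plain split. The second sample string below is made up to exercise the code.

```perl
use strict;
use warnings;

# Sample strings in the "TAG:lemma" shape; 'XYZ:exemplo' is a made-up value.
my @tagged = ('JFS:lindo', 'XYZ:exemplo');

# Split each entry at the first colon into a tag and a lemma.
my @pairs = map {
    my ($tag, $lemma) = split /:/, $_, 2;
    +{ tag => $tag, lemma => $lemma };
} @tagged;
```

The limit of 2 keeps any further colons inside the lemma part.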
onethatverif
Given a pattern feature structure and a list of analyses (feature structures), returns a true value if there is at least one analysis that verifies the pattern.
    # onethatverif( cond:fs , conj:fs-set) :: bool
    # exists x in conj: verif(cond , x)
    if (onethatverif({ CAT => "adj" }, $pt->fea("linda"))) {
        ...
    }
mkradtxt
isguess
    Lingua::Jspell::isguess(@ana)
Returns true if the analyses in the list are near misses (the unknown attribute is 1).
any2str
    Lingua::Jspell::any2str($ref)
    Lingua::Jspell::any2str($ref, $indentation)
    Lingua::Jspell::any2str($ref, "compact")
hash2str
AUTHOR
Jose Joao Almeida, <jj@di.uminho.pt>
Alberto Simões, <ambs@di.uminho.pt>
BUGS
Please report any bugs or feature requests to bug-lingua-jspell@rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Jspell. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
COPYRIGHT & LICENSE
Copyright 2007-2009 Projecto Natura
This program is free software; licensed under GPL.