fastgedcom.parser
Functions to parse gedcom files into Document.
Importing this module registers the ansel and gedcom codecs from the ansel Python library.
Module Contents
Classes
ParsingWarning | Base warning class.
LineParsingWarning | Warn about a line with a single word.
DuplicateXRefWarning | Warn about a cross-reference identifier that is defined twice.
LevelInconsistencyWarning | Warn about a line without a correct parent line.
LevelParsingWarning | Warn about an unparsable line level.
EmptyLineWarning | Warn about an empty line.
CharacterInsteadOfLineWarning | Warn about the presence of a 1-character-long line.
Functions
parse(lines) | Parse the text input to create the Document object.
guess_encoding(file) | Return the guessed encoding of the file.
strict_parse(file) | Open and parse the gedcom file.
Attributes
- exception fastgedcom.parser.ParsingError
Bases: Exception
Error raised by strict_parse().
- exception fastgedcom.parser.NothingParsed
Bases: ParsingError
Raised by strict_parse() when the resulting document is empty.
- class fastgedcom.parser.LineParsingWarning
Bases: ParsingWarning
Warn about a line with a single word. There should be at least a line level and a tag.
- class fastgedcom.parser.DuplicateXRefWarning
Bases: ParsingWarning
Warn about a cross-reference identifier that is defined twice.
- class fastgedcom.parser.LevelInconsistencyWarning
Bases: ParsingWarning
Warn about a line without a correct parent line.
- class fastgedcom.parser.LevelParsingWarning
Bases: ParsingWarning
Warn about an unparsable line level.
- class fastgedcom.parser.EmptyLineWarning
Bases: ParsingWarning
Warn about an empty line.
- class fastgedcom.parser.CharacterInsteadOfLineWarning
Bases: ParsingWarning
Warn about the presence of a 1-character-long line. This happens when the object parsed is an iterable of characters, whereas an iterable of lines is expected.
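The situation described above is easy to reproduce in plain Python, because iterating over a str yields characters rather than lines. A small sketch, using a made-up three-line gedcom snippet and no library calls:

```python
import io

text = "0 HEAD\n1 CHAR UTF-8\n0 TRLR\n"

# Iterating over a str yields one character at a time.
as_characters = list(text)[:3]

# Either of these yields whole lines instead.
as_lines = text.splitlines()
as_stream_lines = list(io.StringIO(text))  # each item keeps its trailing "\n"
```

Passing `text.splitlines()` or an open text file, rather than `text` itself, gives the parser the iterable of lines it expects.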
- fastgedcom.parser.parse(lines: Iterable[str]) -> tuple[fastgedcom.base.Document, list[ParsingWarning]]
Parse the text input to create the Document object. The ParsingWarning instances raised during parsing are returned in a list alongside the document.
Only CharacterInsteadOfLineWarning stops the parsing. If any other warning occurs, the parsing continues with the next line.
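Because parse() returns its warnings rather than raising them, the caller decides what to do with them. One option is to tally them by class, as in the sketch below. `summarize_warnings` is a hypothetical helper, not part of fastgedcom, and the file name in the comment is a placeholder.

```python
from collections import Counter


def summarize_warnings(warnings):
    """Count warnings by class name, e.g. {'EmptyLineWarning': 2}."""
    return Counter(type(warning).__name__ for warning in warnings)


# Possible use with the library installed (not executed here):
#   from fastgedcom.parser import parse
#   with open("family.ged", encoding="utf-8-sig") as f:
#       document, warnings = parse(f)
#   print(summarize_warnings(warnings))
```

The helper only relies on the class names of the returned objects, so it works with any of the warning classes listed above.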
- fastgedcom.parser.guess_encoding(file: str | pathlib.Path) -> str | None
Return the guessed encoding of the file, or None if unknown.
A gedcom file should specify its encoding in the header under the tag CHAR. However, the indications in that field are often misleading or incomplete. For example:
  - ANSEL refers to the gedcom version of the ansel charset.
  - The use of a BOM is recommended, but not stated, and not automatically handled by Python.
  - UNICODE refers to UTF-16.
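To make these caveats concrete, here is a rough, standard-library-only sketch of how a BOM check and the CHAR tag could be combined. It is not the library's implementation. `sketch_guess_encoding` and the mapping table are assumptions made for illustration, and the "gedcom" codec name only resolves once the ansel library has registered it.

```python
import codecs
import re

# UTF-32 BOMs are checked first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Illustrative mapping from CHAR values to Python codec names (an assumption).
_CHAR_TO_CODEC = {
    "UTF-8": "utf-8",
    "UNICODE": "utf-16",
    "ASCII": "ascii",
    "ANSEL": "gedcom",  # gedcom flavour of ANSEL, provided by the ansel library
}


def sketch_guess_encoding(data: bytes):
    """Guess a codec name from the first bytes of a gedcom file, or None."""
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    # No BOM: look for a "1 CHAR <value>" line near the top of the file.
    match = re.search(rb"^\s*1\s+CHAR\s+(\S+)", data[:4096], re.MULTILINE)
    if match is None:
        return None
    value = match.group(1).decode("ascii", "replace").upper()
    return _CHAR_TO_CODEC.get(value)
```

The BOM is checked before the header because a BOM, when present, describes the bytes actually on disk, whereas the CHAR value may be misleading.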
- fastgedcom.parser.strict_parse(file: str | pathlib.Path) -> fastgedcom.base.Document
Open and parse the gedcom file. Return the Document representing the gedcom file.
Raise ParsingError when an error occurs in the parsing process. Raise NothingParsed when the input is empty or isn't gedcom.
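Since NothingParsed derives from ParsingError, a caller that wants to tell the two cases apart has to catch NothingParsed first. A minimal sketch follows. `load_gedcom` is a hypothetical wrapper, not part of fastgedcom, and the import sits inside the function only so that the sketch can be defined without the library installed.

```python
def load_gedcom(path):
    """Return the Document for `path`, or None when parsing fails."""
    # Imported lazily; requires fastgedcom at call time.
    from fastgedcom.parser import NothingParsed, ParsingError, strict_parse

    try:
        return strict_parse(path)
    except NothingParsed:
        # Subclass of ParsingError, so this clause must come first.
        print(f"{path}: empty or not a gedcom file")
    except ParsingError as err:
        print(f"{path}: parsing failed: {err}")
    return None
```

Returning None keeps the example short; re-raising or logging may suit a real application better.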