fastgedcom.parser
Functions to parse gedcom files into a Document object.
On module import, register the ansel and gedcom codecs from the ansel python library.
Module Contents
Classes
ParsingWarning | Base warning class.
LineParsingWarning | Warn about a line with a single word.
DuplicateXRefWarning | Warn about a cross-reference identifier that is defined twice.
LevelInconsistencyWarning | Warn about a line without a correct parent line.
LevelParsingWarning | Warn about an unparsable line level. Failed to parse it to an integer.
EmptyLineWarning | Warn about an empty line.
CharacterInsteadOfLineWarning | Warn about the presence of a 1-character-long line.
Functions
parse(lines) | Parse the text input to create a Document object.
guess_encoding(file) | Return the guessed encoding of the file.
strict_parse(file) | Open and parse the gedcom file.
Attributes
- class fastgedcom.parser.LineParsingWarning
Bases: ParsingWarning
Warn about a line with a single word. There should be at least a line level and a tag.
- class fastgedcom.parser.DuplicateXRefWarning
Bases: ParsingWarning
Warn about a cross-reference identifier that is defined twice.
- class fastgedcom.parser.LevelInconsistencyWarning
Bases: ParsingWarning
Warn about a line without a correct parent line.
- class fastgedcom.parser.LevelParsingWarning
Bases: ParsingWarning
Warn about an unparsable line level. Failed to parse it to an integer.
- class fastgedcom.parser.EmptyLineWarning
Bases: ParsingWarning
Warn about an empty line.
- class fastgedcom.parser.CharacterInsteadOfLineWarning
Bases: ParsingWarning
Warn about the presence of a 1-character-long line. This happens when the object parsed is an iterable of characters, whereas an iterable of lines is expected.
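The character-versus-line distinction can be seen with plain Python, without the library. The snippet below is only an illustration; the sample text and variable names are made up for the example. Iterating over a `str` yields single characters, while `splitlines()` yields one string per line.

```python
# Sample gedcom-like text, invented for this illustration.
gedcom_text = "0 HEAD\n1 CHAR UTF-8\n0 TRLR\n"

# Iterating over a str yields single characters, not lines.
as_characters = list(gedcom_text)

# splitlines() yields one string per line, which is the shape a
# line-based parser expects.
as_lines = gedcom_text.splitlines()

print(as_characters[:3])
print(as_lines)
```

Passing a whole string where an iterable of lines is expected is therefore the likely cause of this warning; passing an open text file or `text.splitlines()` avoids it.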
- fastgedcom.parser.parse(lines: Iterable[str]) -> tuple[fastgedcom.base.Document, list[ParsingWarning]]
Parse the text input to create a Document object.
When a malformed line is encountered, a warning is created and the parsing continues with the next line. Only CharacterInsteadOfLineWarning stops the parsing. For LevelInconsistencyWarning, the line is still inserted in the tree.
Return the Document and the list of ParsingWarning encountered.
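The lenient behaviour described above (warn, skip, and carry on) can be sketched in a few lines. This is a simplified, hypothetical stand-in and not fastgedcom's implementation: the function name `collect_line_warnings`, the string warnings, and the flat `(level, rest)` records are all assumptions made for illustration, and duplicate cross-reference detection is left out.

```python
from typing import Iterable, List, Tuple


def collect_line_warnings(
    lines: Iterable[str],
) -> Tuple[List[Tuple[int, str]], List[str]]:
    """Hypothetical lenient loop; not fastgedcom's implementation."""
    records: List[Tuple[int, str]] = []
    issues: List[str] = []
    previous_level = -1
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            issues.append(f"line {number}: empty line")
            continue
        if len(line) == 1:
            # Probably iterating over characters, not lines: stop here.
            issues.append(f"line {number}: single character, stopping")
            break
        parts = line.split(" ", 1)
        if len(parts) < 2:
            issues.append(f"line {number}: single word")
            continue
        try:
            level = int(parts[0])
        except ValueError:
            issues.append(f"line {number}: unparsable level")
            continue
        if level > previous_level + 1:
            # No correct parent line: warn, but keep the line anyway.
            issues.append(f"line {number}: level inconsistency")
        records.append((level, parts[1]))
        previous_level = level
    return records, issues


records, issues = collect_line_warnings(
    ["0 HEAD", "", "x NAME", "TRLR", "2 DATE 1900", "0 TRLR"]
)
print(records)
print(issues)
```

In this sketch every problem except the single-character case skips only the offending line, and the level-inconsistent line is still kept, mirroring the rules stated for `parse()`.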
- fastgedcom.parser.guess_encoding(file: str | pathlib.Path) -> str | None
Return the guessed encoding of the file, or None if unknown.
A gedcom file should specify its encoding in the header under the tag CHAR. However, the indications in that field are often misleading or incomplete. For example:
  - ANSEL refers to the gedcom version of the ansel charset.
  - The use of a BOM is recommended but not stated, and not automatically handled by Python.
  - UNICODE refers to UTF-16.
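To make the CHAR caveats concrete, here is a minimal sketch of how a header value might be mapped to a Python codec name. It is not the logic of `guess_encoding()`: the table `CHAR_TO_CODEC`, the function `codec_for_header`, and the choice to let a UTF-8 BOM take priority are assumptions for illustration. The `"gedcom"` codec name is only usable once the ansel library has registered it, as the module does on import.

```python
import codecs
from typing import Optional

# Hypothetical mapping from CHAR header values to Python codec names.
CHAR_TO_CODEC = {
    "ANSEL": "gedcom",    # the gedcom variant of the ansel charset
    "UNICODE": "utf-16",  # UNICODE means UTF-16 in gedcom headers
    "UTF-8": "utf-8",
    "ASCII": "ascii",
}


def codec_for_header(char_value: str, first_bytes: bytes) -> Optional[str]:
    """Return a codec name for a CHAR value, or None if unknown."""
    # Python's plain "utf-8" codec keeps the BOM in the decoded text;
    # "utf-8-sig" strips it, so prefer it when a BOM is present.
    if first_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return CHAR_TO_CODEC.get(char_value.strip().upper())


print(codec_for_header("ANSEL", b"0 HEAD"))
print(codec_for_header("UTF-8", codecs.BOM_UTF8 + b"0 HEAD"))
print(codec_for_header("IBMPC", b"0 HEAD"))
```

Returning `None` for unrecognised values follows the same convention as the documented return type, leaving the fallback decision to the caller.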
- exception fastgedcom.parser.ParsingError
Bases: Exception
Error raised by strict_parse().
- exception fastgedcom.parser.NothingParsedError
Bases: ParsingError
Raised by strict_parse() when the resulting document is empty.
- exception fastgedcom.parser.MalformedError
Bases: ParsingError
Raised by strict_parse() when there are warnings.
  - warnings: list[ParsingWarning]
- fastgedcom.parser.strict_parse(file: str | pathlib.Path) -> fastgedcom.base.Document
Open and parse the gedcom file. Return the Document representing the gedcom file.
Raise NothingParsedError when the input is empty or isn't gedcom. Raise MalformedError when the parsing process produces warnings.
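The relationship between lenient and strict parsing can be sketched as a thin layer that turns an empty result or a non-empty warning list into exceptions. Everything below is hypothetical: the `Sketch*` classes only mirror the documented exception hierarchy, `require_clean` is an invented helper, and the order of the two checks is an assumption rather than the library's documented behaviour.

```python
from typing import List


class SketchParsingError(Exception):
    """Stand-in for the documented ParsingError base class."""


class SketchNothingParsedError(SketchParsingError):
    """Stand-in: raised when nothing was parsed."""


class SketchMalformedError(SketchParsingError):
    """Stand-in: raised when warnings were collected."""

    def __init__(self, warnings: List[str]) -> None:
        super().__init__(f"{len(warnings)} warning(s) during parsing")
        # Keep the warnings available to the caller, like the
        # documented `warnings` attribute.
        self.warnings = warnings


def require_clean(records: list, warnings: List[str]) -> list:
    """Return records only if parsing produced content and no warnings."""
    if not records:
        raise SketchNothingParsedError("nothing was parsed")
    if warnings:
        raise SketchMalformedError(warnings)
    return records


try:
    require_clean([(0, "HEAD")], ["line 2: empty line"])
except SketchMalformedError as error:
    print(error, error.warnings)
```

Because both exceptions share a base class, a caller can catch the base class to handle any strict-parsing failure, or catch the malformed case alone to inspect its `warnings` list.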