fastgedcom.parser
=================

.. py:module:: fastgedcom.parser

.. autoapi-nested-parse::

   Functions to parse gedcom files into :py:class:`.Document`.

   On module import, register the ansel and gedcom codecs from the
   ansel Python library.


Attributes
----------

.. autoapisummary::

   fastgedcom.parser.IS_ANSEL_INSTALLED


Exceptions
----------

.. autoapisummary::

   fastgedcom.parser.ParsingError
   fastgedcom.parser.NothingParsedError
   fastgedcom.parser.MalformedError


Classes
-------

.. autoapisummary::

   fastgedcom.parser.ParsingWarning
   fastgedcom.parser.LineParsingWarning
   fastgedcom.parser.DuplicateXRefWarning
   fastgedcom.parser.LevelInconsistencyWarning
   fastgedcom.parser.LevelParsingWarning
   fastgedcom.parser.EmptyLineWarning
   fastgedcom.parser.CharacterInsteadOfLineWarning


Functions
---------

.. autoapisummary::

   fastgedcom.parser.parse
   fastgedcom.parser.guess_encoding
   fastgedcom.parser.strict_parse


Module Contents
---------------

.. py:data:: IS_ANSEL_INSTALLED
   :value: False


.. py:class:: ParsingWarning

   Base warning class.


.. py:class:: LineParsingWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about a line with a single word.

   There should be at least a line level and a tag.


   .. py:attribute:: line_number
      :type:  int


   .. py:attribute:: line_content
      :type:  str


.. py:class:: DuplicateXRefWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about a cross-reference identifier that is defined twice.


   .. py:attribute:: xref
      :type:  fastgedcom.base.XRef


.. py:class:: LevelInconsistencyWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about a line without a correct parent line.


   .. py:attribute:: line_number
      :type:  int


   .. py:attribute:: line_content
      :type:  str


.. py:class:: LevelParsingWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about an unparsable line level.

   The level could not be parsed as an integer.


   .. py:attribute:: line_number
      :type:  int


   .. py:attribute:: line_content
      :type:  str


.. py:class:: EmptyLineWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about an empty line.


   .. py:attribute:: line_number
      :type:  int


.. py:class:: CharacterInsteadOfLineWarning

   Bases: :py:obj:`ParsingWarning`

   Warn about the presence of a 1-character-long line.

   This happens when the parsed object is an iterable of characters,
   whereas an iterable of lines is expected.


   .. py:attribute:: line_number
      :type:  int


.. py:function:: parse(lines: Iterable[str]) -> tuple[fastgedcom.base.Document, list[ParsingWarning]]

   Parse the text input to create a :py:class:`.Document` object.

   When a malformed line is encountered, a warning is created and
   parsing continues with the next line.
   Only :py:class:`.CharacterInsteadOfLineWarning` stops the parsing.
   For :py:class:`.LevelInconsistencyWarning`, the line is still inserted
   in the tree.

   Return the :py:class:`.Document` and the list of
   :py:class:`.ParsingWarning` encountered.


.. py:function:: guess_encoding(file: str | pathlib.Path) -> str | None

   Return the guessed encoding of the ``file``, or ``None`` if unknown.

   A gedcom file should specify its encoding in the header under the tag CHAR.
   However, the value of that field is often misleading or incomplete.
   For example:

   - ANSEL refers to the gedcom version of the ansel charset.
   - Using a BOM is recommended but not stated, and Python does not
     handle it automatically.
   - UNICODE refers to UTF-16.


.. py:exception:: ParsingError

   Bases: :py:obj:`Exception`

   Error raised by :py:func:`.strict_parse`.


.. py:exception:: NothingParsedError

   Bases: :py:obj:`ParsingError`

   Raised by :py:func:`.strict_parse` when the resulting document is empty.


.. py:exception:: MalformedError

   Bases: :py:obj:`ParsingError`

   Raised by :py:func:`.strict_parse` when there are warnings.


   .. py:attribute:: warnings
      :type:  list[ParsingWarning]


.. py:function:: strict_parse(file: str | pathlib.Path) -> fastgedcom.base.Document

   Open and parse the gedcom file.

   Return the :py:class:`.Document` representing the gedcom file.

   Raise :py:exc:`.NothingParsedError` when the input is empty or isn't gedcom.
   Raise :py:exc:`.MalformedError` when an error occurs in the parsing process.
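
The BOM caveat noted under ``guess_encoding`` can be illustrated with the standard library alone. The following is a minimal sketch, not the implementation used by fastgedcom: the helper name ``sniff_bom`` and the table ``_BOMS`` are hypothetical and exist only for this example. It shows that decoding with plain ``utf-8`` leaves the BOM in the text, while a BOM-aware codec removes it.

```python
import codecs
from typing import Optional

# Hypothetical lookup table, not part of fastgedcom.
# UTF-32 comes first because BOM_UTF32_LE begins with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return a BOM-aware codec name if ``data`` starts with a known BOM, else None."""
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    return None


# A gedcom header line saved as UTF-8 with a BOM.
raw = codecs.BOM_UTF8 + "0 HEAD\n".encode("utf-8")

# Plain "utf-8" keeps the BOM as a leading U+FEFF character in the decoded text.
naive_text = raw.decode("utf-8")

# The codec suggested by the sniffing helper strips the BOM while decoding.
clean_text = raw.decode(sniff_bom(raw) or "utf-8")
```

Here ``naive_text`` starts with ``"\ufeff"``, which would corrupt the level number of the first line, whereas ``clean_text`` starts directly with ``"0 HEAD"``. A file without a BOM makes ``sniff_bom`` return ``None``, so the header's CHAR field or another heuristic is still needed in that case.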