markdown.obsidian.personal.machine_learning.definition_and_notation_naming
trouver.markdown.obsidian.personal.machine_learning.tokenize.def_and_notat_token_classification
has functions for gathering and processing training data and for using ML models to identify definitions and notations introduced in notes via token classification. Identified definitions and notations are marked using HTML tags. It would be convenient to also predict the “names” of these definitions and notations.
TODO: insert examples of definitions and notations with HTML tags and examples of what the “names” of these definitions and notation should be
Gather ML data from information notes
data_from_information_note
data_from_information_note (info_note:trouver.markdown.obsidian.vault.VaultNote)
*Obtain data for naming definitions and notations from a standard information note.

Definitions and notations should be marked by HTML tags (see markdown.obsidian.personal.machine_learning.tokenize.def_and_notat_token_classification).

- A definition is to be marked by an HTML tag with a `definition` attribute, which is the definition’s “name”, i.e. words and/or phrases describing what the definition is called and to what objects/situations the definition is applicable. If multiple combinations of words/phrases are appropriate, then they are separated by a single semicolon `;`. If the `definition` attribute is `""`, then the definition name has not been marked, either manually or automatically.
- A notation (technically, the full LaTeX string in which the notation is introduced) is to be marked by an HTML tag with a `notation` attribute, which is the notation’s “name”, i.e. the actual notation introduced in the LaTeX string (without surrounding dollar signs `$` or `$$`). If multiple notations are appropriate, then they are separated by double semicolons `;;`. If the `notation` attribute is `""`, then either the notation has not been marked, or the LaTeX string (minus the surrounding dollar signs) is exactly the introduced notation.

Returns

- list[dict[str, str]] - Each dict corresponds to a single datapoint, which holds the data of the naming of a single definition or notation (LaTeX str) introduced in `info_note`. The keys are `'text'` and either `'definition'` or `'notation'`. The `text` entry should be the processed text of `info_note`; see `process_standard_information_note`.*
| | Type | Details |
|---|---|---|
| info_note | VaultNote | The standard information note from which to draw data. |
| Returns | list | Each dict corresponds to a single datapoint, which holds the data of the naming of a single definition or notation (LaTeX str) introduced in info_note. |
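The attribute conventions described above can be illustrated with a minimal sketch. This is not trouver's implementation: the tag names (`b`, `span`), the sample sentence, and the helpers `split_names` and `extract_marked_names` are hypothetical, and the regular expression only handles simple, non-nested tags with double-quoted attributes. An empty attribute yields an empty list of names; the sketch does not try to decide which of the meanings of `""` applies.

```python
import re

# Hypothetical pattern: any tag carrying a `definition` or `notation` attribute.
# Assumes double-quoted attribute values and no nesting of same-named tags.
TAG_PATTERN = re.compile(
    r'<(\w+)\b[^>]*?\b(definition|notation)="([^"]*)"[^>]*>(.*?)</\1>',
    re.DOTALL,
)


def split_names(kind: str, raw: str) -> list[str]:
    """Split an attribute value into names.

    Definition names are separated by a single semicolon,
    notation names by double semicolons.
    """
    separator = ";" if kind == "definition" else ";;"
    return [part.strip() for part in raw.split(separator) if part.strip()]


def extract_marked_names(text: str) -> list[dict]:
    """Return one record per marked definition or notation found in `text`."""
    records = []
    for match in TAG_PATTERN.finditer(text):
        _tag, kind, raw, inner = match.groups()
        records.append(
            {"kind": kind, "marked_text": inner, "names": split_names(kind, raw)}
        )
    return records


# Made-up sample text; the names in the attributes are illustrative only.
sample = (
    'A <b definition="group; group under an operation">group</b> is a set with '
    'an associative operation, written '
    '<span notation="G;;\\cdot">$(G, \\cdot)$</span>.'
)

for record in extract_marked_names(sample):
    print(record)
```

A real note may contain nested markup or LaTeX with angle brackets, in which case a proper HTML parser would be a safer choice than a regular expression.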
Use the ML model
predict_names
predict_names (info_note:trouver.markdown.obsidian.vault.VaultNote, def_and_notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline], def_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline], notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline])
*Predict the names of the definitions and notations using the trained ML models.

Either `def_and_notat_pipeline` or both `def_pipeline` and `notat_pipeline` should be provided.*
| | Type | Details |
|---|---|---|
| info_note | VaultNote | |
| def_and_notat_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of both definitions and notations. |
| def_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of definitions. |
| notat_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of notations. |
| Returns | list | |
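The rule that either `def_and_notat_pipeline` or both `def_pipeline` and `notat_pipeline` should be provided can be checked up front. The helper below is a hypothetical sketch, not part of trouver; in particular, giving precedence to the combined pipeline when all three are passed is an assumption, and the pipelines are treated as opaque objects.

```python
from typing import Any, Optional


def select_pipelines(
    def_and_notat_pipeline: Optional[Any] = None,
    def_pipeline: Optional[Any] = None,
    notat_pipeline: Optional[Any] = None,
) -> tuple[Any, Any]:
    """Return (pipeline for definitions, pipeline for notations).

    Hypothetical helper: the combined pipeline wins if it is given
    (an assumption); otherwise both separate pipelines are required.
    """
    if def_and_notat_pipeline is not None:
        return def_and_notat_pipeline, def_and_notat_pipeline
    if def_pipeline is not None and notat_pipeline is not None:
        return def_pipeline, notat_pipeline
    raise ValueError(
        "Provide either def_and_notat_pipeline, "
        "or both def_pipeline and notat_pipeline."
    )
```

Failing early with a `ValueError` keeps a missing pipeline from surfacing later as a less informative error in the middle of prediction.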
Naming notation notes
Another convenient functionality is to name notation notes automatically.
add_autogen_name_to_notation_note
add_autogen_name_to_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote, autogen_name:str)
sanitize_autogen_name
sanitize_autogen_name (autogen_name)
autogen_name_from_notation_note
autogen_name_from_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote, pipeline)
predict_name_and_add_to_notation_note
predict_name_and_add_to_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote, notation_note_naming_pipeline:transformers.pipelines.text2text_generation.SummarizationPipeline, reference:str)
Predict an appropriate name for the notation note and add it in the YAML frontmatter metadata.
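As a rough illustration of what writing a name into YAML frontmatter can involve, here is a minimal sketch that works on plain text. It is not trouver's implementation: the function `set_frontmatter_field` and the key `autogen_name` are hypothetical, and only single-line top-level keys are handled. The value is written as a single-quoted YAML scalar, in which backslashes are literal, which suits names containing LaTeX.

```python
def yaml_single_quote(value: str) -> str:
    """Quote a string as a single-quoted YAML scalar (a quote is doubled)."""
    return "'" + value.replace("'", "''") + "'"


def set_frontmatter_field(note_text: str, key: str, value: str) -> str:
    """Set `key` in the YAML frontmatter of `note_text`, creating it if absent.

    Hypothetical helper: handles only single-line top-level keys.
    """
    entry = f"{key}: {yaml_single_quote(value)}"
    lines = note_text.split("\n")
    if lines and lines[0].strip() == "---":
        # Find the line that closes the frontmatter block, if any.
        closing = next(
            (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
        )
        if closing is not None:
            # Drop any previous entry for the same key, then append the new one.
            kept = [ln for ln in lines[1:closing] if not ln.startswith(f"{key}:")]
            return "\n".join(["---", *kept, entry, "---", *lines[closing + 1:]])
    # No (well-formed) frontmatter: prepend a new block.
    return "\n".join(["---", entry, "---", note_text])


# Made-up note text and name, for illustration only.
note = "---\ntags: [notation]\n---\n$\\mathbb{Z}$ denotes the integers."
print(set_frontmatter_field(note, "autogen_name", "\\mathbb{Z}"))
```

For notes with nested or multi-line metadata, a YAML library would be more robust than line-based editing.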