markdown.obsidian.personal.machine_learning.information_note_types

Functions for gathering machine learning data on the types of math information notes from tags and for using ML models trained on such data to predict the types of math information notes.

Some common types of components in mathematical writing include definitions, notations, concepts (e.g. theorems, propositions, corollaries, lemmas), and proofs. The functions in this module gather data from labeled “standard information notes” (formatted in trouver’s standard formatting) in an Obsidian.md vault about the types of these notes. Such data can be used to train a categorization ML model to predict the types of unlabeled notes.

The labels are Markdown tags in the notes’ YAML frontmatter metadata (so tags in the body of the Markdown file are ignored). For example, the note

---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from

has the tag #_meta/definition.
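
As a rough illustration of reading labels only from the frontmatter, the sketch below parses an inline `tags: [...]` list with a regular expression. The helpers `frontmatter_tags` and `is_labeled` are hypothetical names, not trouver's implementation (which works through `MarkdownFile`), and the sketch assumes the tags are written as a single-line inline list.

```python
import re


def frontmatter_tags(text):
    """Return the tags of an inline `tags: [a, b]` list in the YAML frontmatter."""
    # The frontmatter is the block between the first two `---` lines.
    match = re.match(r'---\n(.*?)\n---\n', text, flags=re.DOTALL)
    if match is None:
        return []
    for line in match.group(1).splitlines():
        if line.startswith('tags:'):
            inner = line[len('tags:'):].strip().strip('[]')
            return [tag.strip() for tag in inner.split(',') if tag.strip()]
    return []


def is_labeled(text, label_tag):
    """Hypothetical check that follows the `#`-prefixed tag convention."""
    if not label_tag.startswith('#'):
        raise ValueError('label_tag must start with "#"')
    return label_tag[1:] in frontmatter_tags(text)


note_text = (
    '---\n'
    'tags: [_meta/literature_note, _meta/definition]\n'
    '---\n'
    '# Rings\n'
    'A body tag such as #_meta/notation is ignored.\n'
)
print(frontmatter_tags(note_text))
print(is_labeled(note_text, '#_meta/definition'))
print(is_labeled(note_text, '#_meta/notation'))
```

Because only the frontmatter block is inspected, the `#_meta/notation` tag in the body does not count as a label.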

LABEL_TAGS
['#_meta/concept',
 '#_meta/exercise',
 '#_meta/definition',
 '#_meta/example',
 '#_meta/narrative',
 '#_meta/notation',
 '#_meta/proof',
 '#_meta/remark',
 '#_meta/TODO/split',
 '#_meta/TODO/merge',
 '#_meta/TODO/delete',
 '#_meta/hint',
 '#_meta/how_to',
 '#_meta/conjecture',
 '#_meta/convention',
 '#_meta/context']

LABEL_TAGS above lists the tags for the note types that we would like to eventually train a model to predict. The following are the tags for which the author of trouver has ample labeled data:

Note that the author of trouver has only trained a model that predicts some of the note types listed in LABEL_TAGS. Moreover, the accuracy of the predictions can vary widely among the different types.

It is often appropriate to label a single note with more than one of these tags. For example, a note containing the statement “We define the ring \(\mathbb{Z}/n\mathbb{Z}\) of integers modulo \(n\)” is both a definition note and a notation note because it both introduces the notion of the ring of integers modulo \(n\) and gives notation for the ring.
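
Since a note can carry several type tags at once, the prediction task is multi-label rather than single-label. A minimal sketch of one common way to encode such labels, assuming a shortened label list (`LABELS` and `multi_hot` are illustrative names, not part of trouver):

```python
# A shortened, assumed label list; the full list is LABEL_TAGS.
LABELS = ['#_meta/concept', '#_meta/definition', '#_meta/notation', '#_meta/proof']


def multi_hot(note_tags):
    """Encode a collection of tags as a 0/1 vector aligned with LABELS."""
    return [1 if label in note_tags else 0 for label in LABELS]


# A note that is both a definition note and a notation note.
print(multi_hot({'#_meta/definition', '#_meta/notation'}))  # [0, 1, 1, 0]
```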

from fastai.text.learner import TextLearner
from fastai.learner import load_learner
import pathlib
from pathlib import WindowsPath
import platform
import shutil
import tempfile
from unittest import mock

from fastcore.test import *
from torch import tensor

from trouver.helper import _test_directory
from trouver.markdown.obsidian.personal.notes import notes_linked_in_note

Gather and label data


note_is_labeled_with_tag

 note_is_labeled_with_tag (note:trouver.markdown.obsidian.vault.VaultNote,
                           label_tag:str)

Return True if the standard information note is labeled as being of a specified type.

Raises

  • ValueError
    • If label_tag does not include the beginning hashtag #.
| | Type | Details |
|---|---|---|
| note | VaultNote | |
| label_tag | str | A tag which labels a type that note is. Includes the beginning hashtag #, e.g. #_meta/definition, #_meta/TODO/split |
| Returns | bool | True if note is labeled as type label_tag. |

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_is_labeled_with_tag` will use
    # a note whose text is `sample_text`.
    assert note_is_labeled_with_tag(mock_note, '#_meta/definition')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/notation')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/concept')

    with ExceptionExpected(ValueError):
        # The argument to `label_tag` requires the starting hashtag `#`.
        note_is_labeled_with_tag(mock_note, '_meta/definition')

note_labels

 note_labels (note:trouver.markdown.obsidian.vault.VaultNote)

Return a dict indicating what labels a note has.

The labels come from the LABEL_TAGS list.

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_labels` will use
    # a note whose text is `sample_text`.
    sample_output = note_labels(mock_note)
    test_eq(sample_output['#_meta/definition'], 'IS #_meta/definition')
    test_eq(sample_output['#_meta/concept'], 'NOT #_meta/concept')
    for label_tag in LABEL_TAGS:
        assert label_tag in sample_output
    print(sample_output)
{'#_meta/concept': 'NOT #_meta/concept', '#_meta/exercise': 'NOT #_meta/exercise', '#_meta/definition': 'IS #_meta/definition', '#_meta/example': 'NOT #_meta/example', '#_meta/narrative': 'NOT #_meta/narrative', '#_meta/notation': 'NOT #_meta/notation', '#_meta/proof': 'NOT #_meta/proof', '#_meta/remark': 'NOT #_meta/remark', '#_meta/TODO/split': 'NOT #_meta/TODO/split', '#_meta/TODO/merge': 'NOT #_meta/TODO/merge', '#_meta/TODO/delete': 'NOT #_meta/TODO/delete', '#_meta/hint': 'NOT #_meta/hint', '#_meta/how_to': 'NOT #_meta/how_to', '#_meta/conjecture': 'NOT #_meta/conjecture', '#_meta/convention': 'NOT #_meta/convention', '#_meta/context': 'NOT #_meta/context'}
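
The `'IS ...'` / `'NOT ...'` strings above can be reduced back to the list of tags a note actually has. A minimal sketch, where `positive_labels` is a hypothetical helper and the dict is a shortened copy of the output above:

```python
def positive_labels(labels):
    """Return the tags whose value marks the note as having that type."""
    return [tag for tag, value in labels.items() if value.startswith('IS ')]


shortened_output = {
    '#_meta/concept': 'NOT #_meta/concept',
    '#_meta/definition': 'IS #_meta/definition',
    '#_meta/notation': 'NOT #_meta/notation',
}
print(positive_labels(shortened_output))  # ['#_meta/definition']
```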

The way that data for information note types should be obtained is fairly simple - for each note,


gather_information_note_types

 gather_information_note_types (vault:os.PathLike,
                                notes:list[trouver.markdown.obsidian.vault
                                .VaultNote])

Return a pandas.DataFrame encapsulating the data of note labels.

| | Type | Details |
|---|---|---|
| vault | PathLike | |
| notes | list | |
| Returns | DataFrame | Has columns Time added, Time modified, Note name, Full note content, Processed note content as well as columns for each tag label. See append_to_information_note_type_database for more details about these columns. |

test_vault = _test_directory() / 'test_vault_6'
index_note = VaultNote(test_vault, name='_index_1_introduction_reference_with_tag_labels')
# The index note links to 6 notes; `df.head()` below shows only the first 5
notes = notes_linked_in_note(index_note, as_dict=False)
df = gather_information_note_types(test_vault, notes)
test_eq(len(df), 6)
df.head()
Time added Time modified Note name Full note content Processed note content #_meta/concept #_meta/exercise #_meta/definition #_meta/example #_meta/narrative ... #_meta/proof #_meta/remark #_meta/TODO/split #_meta/TODO/merge #_meta/TODO/delete #_meta/hint #_meta/how_to #_meta/conjecture #_meta/convention #_meta/context
0 2023-04-30T19:05 2023-04-30T19:05 reference_with_tag_labels_something_something ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/narrative]\n---\n# Topic[^1]\n\nIn this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n\n\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Page 1 In this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n NOT #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example IS #_meta/narrative ... NOT #_meta/proof NOT #_meta/remark NOT #_meta/TODO/split NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context
1 2023-04-30T19:05 2023-04-30T19:05 reference_with_tag_labels_Definition 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition]\n---\n# Ring[^1]\n\nA **ring** is a set with binary operators $+$ and $\cdot$ such that ...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Definition 1 A ring is a set with binary operators $+$ and $\cdot$ such that ...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/proof NOT #_meta/remark NOT #_meta/TODO/split NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context
2 2023-04-30T19:05 2023-04-30T19:05 reference_with_tag_labels_Definition 2 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition, _meta/notation]\n---\n# Ring of integers modulo $n$[^1]\n\nLet $n \geq 1$ be an integer. The **ring of integers modulo $n$**, denoted by **$\mathbb{Z}/n\mathbb{Z}$**, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n\n\n# See Also\n- [[reference_with_tag_labels_Exercise 1|reference_with_tag_labels_Z_nZ_is_a_ring]]\n# Meta\n## References\n\n## ... Let $n \geq 1$ be an integer. The ring of integers modulo $n$, denoted by $\mathbb{Z}/n\mathbb{Z}$, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/proof NOT #_meta/remark NOT #_meta/TODO/split NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context
3 2023-04-30T19:05 2023-04-30T19:05 reference_with_tag_labels_Exercise 1 ---\ncssclass: clean-embeds\naliases: [reference_with_tag_labels_Z_nZ_is_a_ring]\ntags: [_meta/literature_note, _meta/exercise]\n---\n# $\mathbb{Z}/n\mathbb{Z}$ is a ring[^1]\n\nShow that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Exercise 1 Show that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n NOT #_meta/concept IS #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/proof NOT #_meta/remark NOT #_meta/TODO/split NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context
4 2023-04-30T19:05 2023-04-30T19:05 reference_with_tag_labels_Theorem 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _meta/proof]\n---\n# The polynomial ring of a UFD is a UFD[^1]\n\nTheorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Theorem 1 Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n IS #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... IS #_meta/proof NOT #_meta/remark NOT #_meta/TODO/split NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context

5 rows × 21 columns


append_to_information_note_type_database

 append_to_information_note_type_database (vault:os.PathLike,
                                           file:os.PathLike, notes:list[tr
                                           ouver.markdown.obsidian.vault.V
                                           aultNote], backup:bool=True)

Either create a csv file containing data for information note type labels or append to an existing csv file.

The columns of the database file are as follows:

  • Time added - The time when the row was added.
  • Time modified - The time when the labels of the row were last modified.
  • Note name - The name of the note from which the data for the row was derived.
  • Full note content - The entire content/text of the note.
  • Processed note content - The “raw” content of the note without the YAML frontmatter meta, Markdown headings, links, footnotes, etc.
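
To give a feel for what “processed” content means, here is a rough regex-based approximation. It is only a sketch: `rough_process` is a hypothetical helper, not trouver's `process_standard_information_note`, and it assumes the note follows the standard layout with a trailing `# See Also` section.

```python
import re


def rough_process(text):
    """Approximate the 'raw' content of a standard information note."""
    # Drop the YAML frontmatter block delimited by `---` lines at the top.
    text = re.sub(r'\A---\n.*?\n---\n', '', text, flags=re.DOTALL)
    # Keep only the part before the trailing `# See Also` section, if present.
    text = text.split('\n# See Also', 1)[0]
    # Drop Markdown heading lines.
    body = '\n'.join(ln for ln in text.splitlines() if not ln.startswith('#'))
    body = re.sub(r'\[\^[^\]]+\]', '', body)       # footnote references
    body = re.sub(r'!\[\[[^\]]*\]\]', '', body)    # embedded notes
    body = re.sub(r'\[\[(?:[^\]|]*\|)?([^\]]*)\]\]', r'\1', body)  # links -> display text
    return body.replace('**', '').strip()          # bold markers


sample = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1
"""
print(rough_process(sample))
```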

All timestamps are in UTC time and specify time to minutes (i.e. no seconds/microseconds).

If a “new” note has the same processed content as a pre-existing note and anything is different about the “new” note, then update the row of the existing note. In particular, the following are updated:

  • Time modified (set to current time)
  • Note name (overwritten)
  • Full note content (overwritten)
  • Columns for categorization (overwritten)

This method assumes that, if the CSV file exists, the processed contents of its rows are all distinct.
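
The update rule described above amounts to an “upsert” keyed on the processed content. The following is a minimal sketch on plain dicts rather than a CSV file; `utc_minute_stamp` and `upsert_row` are hypothetical names, and the real function may differ in its details.

```python
from datetime import datetime, timezone


def utc_minute_stamp(now=None):
    """Format a UTC time to the minute, e.g. '2023-04-30T19:05'."""
    now = now or datetime.now(timezone.utc)
    return now.strftime('%Y-%m-%dT%H:%M')


def upsert_row(rows, new_row, now=None):
    """Insert `new_row`, or update the row with the same processed content."""
    stamp = utc_minute_stamp(now)
    for row in rows:
        if row['Processed note content'] == new_row['Processed note content']:
            # Only touch the row if something about the note changed.
            if any(row.get(key) != value for key, value in new_row.items()):
                row.update(new_row)
                row['Time modified'] = stamp
            return rows
    rows.append({'Time added': stamp, 'Time modified': stamp, **new_row})
    return rows


rows = []
first_time = datetime(2023, 4, 30, 19, 5, 42, tzinfo=timezone.utc)
later_time = datetime(2023, 5, 1, 8, 30, 0, tzinfo=timezone.utc)
upsert_row(rows, {'Note name': 'a', 'Processed note content': 'A ring is ...'}, now=first_time)
upsert_row(rows, {'Note name': 'b', 'Processed note content': 'A ring is ...'}, now=later_time)
print(rows)
```

After the second call there is still a single row: its name is overwritten and `Time modified` advances, while `Time added` keeps the original timestamp.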

| | Type | Default | Details |
|---|---|---|---|
| vault | PathLike | | The vault from which the data is drawn |
| file | PathLike | | The path to a CSV file |
| notes | list | | The notes to add to the database |
| backup | bool | True | If True, makes a copy of file in the same directory and with the same name, except with an added extension of .bak. |
| Returns | None | | |

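
The `backup` behavior can be pictured with the standard library alone. This is a sketch under the assumption that the backup simply appends `.bak` to the file name; `backup_file` is a hypothetical helper, not the function trouver uses.

```python
import shutil
import tempfile
from pathlib import Path


def backup_file(file):
    """Copy `file` to a sibling with `.bak` appended to its name."""
    file = Path(file)
    backup = file.with_name(file.name + '.bak')
    shutil.copy2(file, backup)
    return backup


with tempfile.TemporaryDirectory() as temp_dir:
    csv_file = Path(temp_dir) / 'information_note_type_labels.csv'
    csv_file.write_text('Note name\n')
    print(backup_file(csv_file).name)  # information_note_type_labels.csv.bak
```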
with tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir:
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    index_note = VaultNote(temp_vault, name='_index_1_introduction_reference_with_tag_labels')
    notes = notes_linked_in_note(index_note, as_dict=False)
    file = temp_vault / '_ml_data' / 'information_note_type_labels.csv'
    append_to_information_note_type_database(
         temp_vault, file, notes)

    # Uncomment these lines to see `temp_vault` and its contents.
#     os.startfile(os.getcwd())
#     input()
    df = pd.read_csv(file)
    print(df.head())
         Time added     Time modified  \
0  2023-04-30T19:05  2023-04-30T19:05   
1  2023-04-30T19:05  2023-04-30T19:05   
2  2023-04-30T19:05  2023-04-30T19:05   
3  2023-04-30T19:05  2023-04-30T19:05   
4  2023-04-30T19:05  2023-04-30T19:05   

                                       Note name  \
0  reference_with_tag_labels_something_something   
1         reference_with_tag_labels_Definition 1   
2         reference_with_tag_labels_Definition 2   
3           reference_with_tag_labels_Exercise 1   
4            reference_with_tag_labels_Theorem 1   

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Full note content  \
0                                                                                                                                                                                                    ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/narrative]\n---\n# Topic[^1]\n\nIn this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n\n\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Page 1   
1                                                                                                                                                                                                                                                                                                                                      ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition]\n---\n# Ring[^1]\n\nA **ring** is a set with binary operators $+$ and $\cdot$ such that ...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Definition 1   
2  ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition, _meta/notation]\n---\n# Ring of integers modulo $n$[^1]\n\nLet $n \geq 1$ be an integer. The **ring of integers modulo $n$**, denoted by **$\mathbb{Z}/n\mathbb{Z}$**, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n\n\n# See Also\n- [[reference_with_tag_labels_Exercise 1|reference_with_tag_labels_Z_nZ_is_a_ring]]\n# Meta\n## References\n\n## ...   
3                                                                                                                                                                                                                                                                                                   ---\ncssclass: clean-embeds\naliases: [reference_with_tag_labels_Z_nZ_is_a_ring]\ntags: [_meta/literature_note, _meta/exercise]\n---\n# $\mathbb{Z}/n\mathbb{Z}$ is a ring[^1]\n\nShow that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Exercise 1   
4                                                                                                                                                          ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _meta/proof]\n---\n# The polynomial ring of a UFD is a UFD[^1]\n\nTheorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Theorem 1   

                                                                                                                                                                                                                                                                                                   Processed note content  \
0                                                                                                           In this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n   
1                                                                                                                                                                                                                                                   A ring is a set with binary operators $+$ and $\cdot$ such that ...\n   
2  Let $n \geq 1$ be an integer. The ring of integers modulo $n$, denoted by $\mathbb{Z}/n\mathbb{Z}$, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n   
3                                                                                                                                                                                                                                                                         Show that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n   
4                                                                                                           Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n   

       #_meta/concept      #_meta/exercise      #_meta/definition  \
0  NOT #_meta/concept  NOT #_meta/exercise  NOT #_meta/definition   
1  NOT #_meta/concept  NOT #_meta/exercise   IS #_meta/definition   
2  NOT #_meta/concept  NOT #_meta/exercise   IS #_meta/definition   
3  NOT #_meta/concept   IS #_meta/exercise  NOT #_meta/definition   
4   IS #_meta/concept  NOT #_meta/exercise  NOT #_meta/definition   

       #_meta/example      #_meta/narrative  ...      #_meta/proof  \
0  NOT #_meta/example   IS #_meta/narrative  ...  NOT #_meta/proof   
1  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/proof   
2  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/proof   
3  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/proof   
4  NOT #_meta/example  NOT #_meta/narrative  ...   IS #_meta/proof   

       #_meta/remark      #_meta/TODO/split      #_meta/TODO/merge  \
0  NOT #_meta/remark  NOT #_meta/TODO/split  NOT #_meta/TODO/merge   
1  NOT #_meta/remark  NOT #_meta/TODO/split  NOT #_meta/TODO/merge   
2  NOT #_meta/remark  NOT #_meta/TODO/split  NOT #_meta/TODO/merge   
3  NOT #_meta/remark  NOT #_meta/TODO/split  NOT #_meta/TODO/merge   
4  NOT #_meta/remark  NOT #_meta/TODO/split  NOT #_meta/TODO/merge   

       #_meta/TODO/delete      #_meta/hint      #_meta/how_to  \
0  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
1  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
2  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
3  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
4  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   

       #_meta/conjecture      #_meta/convention      #_meta/context  
0  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context  
1  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context  
2  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context  
3  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context  
4  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context  

[5 rows x 21 columns]

Use the trained model to predict note types

After training the model (cf. how_to.train_ml_model.fastai), we can now predict the types of notes.


predict_text_types

 predict_text_types (learn:fastai.text.learner.TextLearner,
                     texts:list[str], remove_NO_TAG:bool=True)

Predict the types of mathematical texts using an ML model.

| | Type | Default | Details |
|---|---|---|---|
| learn | TextLearner | | The ML model predicting note types. |
| texts | list | | |
| remove_NO_TAG | bool | True | If True, remove NO_TAG, which in theory is supposed to indicate that no types are predicted, but in practice can somehow be predicted along with actual types. |
| Returns | list | | Each list corresponds to a text and contains the predicted types of the text. |
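
The effect of `remove_NO_TAG` can be sketched as a simple filter over the predicted tag lists. `drop_no_tag` is a hypothetical stand-in that only illustrates the filtering step, not the model call:

```python
def drop_no_tag(predictions, remove_no_tag=True):
    """Optionally strip the placeholder 'NO_TAG' from each prediction list."""
    if not remove_no_tag:
        return [list(tags) for tags in predictions]
    return [[tag for tag in tags if tag != 'NO_TAG'] for tags in predictions]


raw_predictions = [['#_meta/TODO/delete', 'NO_TAG'], ['NO_TAG']]
print(drop_no_tag(raw_predictions))         # [['#_meta/TODO/delete'], []]
print(drop_no_tag(raw_predictions, False))  # [['#_meta/TODO/delete', 'NO_TAG'], ['NO_TAG']]
```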

We can predict types of short mathematical texts. Say that the information note type classification model, trained in how_to.train_ml_model.fastai, is loaded, e.g. via fastai’s load_learner function:

model = load_learner(<path_to_model>)
texts_to_predict = [
    '',
    'In this chapter, we introduce the notion of rings, some related notions, and many examples.',
    # Raw strings keep LaTeX backslashes such as `\cdot` from being read as escape sequences.
    r'A ring is a set equipped with two binary operators $+$ and $\cdot$ such that...',
    'Theorem. For every prime power $q$, there is, up to isomorphism, exactly one field with $q$ elements.\n\nProof. Let $q = p^k$ where $p$ is a prime. Let $F$ be a field with $q$ elements. Note that...',
    r'Remark. Note that $\mathbb{F}_q$ and $\mathbb{Z}/q\mathbb{Z}$ are different rings',
    r'As an example, take $\mathbb{F}_9$. It can be presented as $\mathbb{F}_3[x^2+1]$ as well as $\mathbb{F}_3[x^2+x+2]$.'
]
sample_outputs = predict_text_types(
    model, texts_to_predict)

print(sample_outputs)
[['#_meta/TODO/delete', '#_meta/TODO/split'], ['#_meta/narrative'], ['#_meta/definition'], ['#_meta/concept', '#_meta/proof'], ['#_meta/remark'], ['#_meta/example']]

predict_note_types

 predict_note_types (learn:fastai.text.learner.TextLearner,
                     vault:os.PathLike, notes:list[trouver.markdown.obsidi
                     an.vault.VaultNote], remove_NO_TAG:bool=True)
| | Type | Default | Details |
|---|---|---|---|
| learn | TextLearner | | The ML model predicting note types. |
| vault | PathLike | | The vault with the notes. |
| notes | list | | The notes with texts to predict |
| remove_NO_TAG | bool | True | If True, remove NO_TAG, which in theory is supposed to indicate that no types are predicted, but in practice can somehow be predicted along with actual types. |
| Returns | list | | |

# TODO: tests
# predict_note_types

test_vault = _test_directory() / 'test_vault_6'
notes_to_predict = [
    VaultNote(test_vault, name='reference_with_tag_labels_Theorem 1'),
    VaultNote(test_vault, name='reference_with_tag_labels_Definition 1')
]
sample_outputs = predict_note_types(model, test_vault, notes_to_predict)

print(
    'The following are the raw content of the notes without '
    'metadata along with the model\'s predictions for their types:\n\n')
for note, prediction in zip(notes_to_predict, sample_outputs):
    print(process_standard_information_note(MarkdownFile.from_vault_note(note), test_vault))
    print(prediction, '\n\n')
The following are the raw content of the notes without metadata along with the model's predictions for their types:


Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.

Proof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.

...

['#_meta/concept', '#_meta/proof'] 


A ring is a set with binary operators $+$ and $\cdot$ such that ...

['#_meta/definition'] 

with (mock.patch('__main__.MarkdownFile.from_vault_note') as mock_markdownfile_from_vault_note,
          mock.patch('__main__.process_standard_information_note') as mock_process_standard_information_note,
          mock.patch('__main__.load_learner') as mock_load_learner):
      mock_path, mock_vault = None, None
      mock_notes = [None, None, None, None, None]
      mock_learner = load_learner(mock_path)
      mock_learner.predict.side_effect = [
          (['#_meta/definition'], tensor([False, False, False, False, False, False,  True, False, False, False,
            False, False, False, False]), tensor([4.5286e-03, 4.3938e-04, 1.3009e-02, 2.3737e-02, 6.1815e-06, 8.0106e-06,
            9.7913e-01, 1.8286e-02, 1.4456e-03, 1.7337e-02, 3.4138e-01, 1.3493e-03,
            8.6190e-05, 6.3249e-03])),
          (['#_meta/exercise'], tensor([False, False, False, False, False, False, False, False,  True, False,
            False, False, False, False]), tensor([1.1748e-03, 1.6718e-04, 1.3828e-02, 9.7984e-02, 2.7409e-06, 1.1164e-06,
            4.8720e-02, 2.7309e-03, 9.9905e-01, 1.3665e-02, 1.7930e-02, 9.9452e-04,
            2.8210e-04, 2.1275e-02])),
          (['#_meta/TODO/delete', 'NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-01, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 9.7614e-01])),
          (['#_meta/concept', '#_meta/proof'], tensor([False, False, False,  True, False, False, False, False, False, False,
            False,  True, False, False]), tensor([4.0871e-03, 3.6683e-04, 1.6594e-01, 9.7876e-01, 6.0281e-05, 8.7817e-06,
            1.9275e-02, 1.5589e-03, 4.5301e-03, 7.8989e-03, 1.2528e-02, 9.2800e-01,
            9.4636e-04, 1.4658e-02])),
          (['NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-02, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 9.7614e-01])),
      ]
      test_eq(predict_note_types(mock_learner, mock_vault, mock_notes),
        [['#_meta/definition'],
         ['#_meta/exercise'],
         ['#_meta/TODO/delete'],
         ['#_meta/concept', '#_meta/proof'],
         []])

      mock_notes = [None, None]
      mock_learner.predict.side_effect = [
          (['#_meta/TODO/delete', 'NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-01, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 9.7614e-01])),
          (['NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-02, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 9.7614e-01])),
      ]
      test_eq(predict_note_types(mock_learner, mock_vault, mock_notes, remove_NO_TAG=False),
        [['#_meta/TODO/delete', 'NO_TAG'],
         ['NO_TAG']])

automatically_add_note_type_tags

 automatically_add_note_type_tags (learn:fastai.text.learner.TextLearner,
                                   vault:os.PathLike, notes:list[trouver.m
                                   arkdown.obsidian.vault.VaultNote],
                                   add_auto_label:bool=True,
                                   overwrite:Optional[str]=None)

Predict note types and add the predicted types as frontmatter YAML tags in the notes.

Non-_auto-labeled tags take precedence over auto-labeled tags, unless overwrite='w'.

Raises

  • Warning:
    • If overwrite=None, a note already has some note type tags, and learn predicts note types different from those in the note.
| | Type | Default | Details |
|---|---|---|---|
| learn | TextLearner | | The ML model predicting note types. |
| vault | PathLike | | The vault with the notes |
| notes | list | | |
| add_auto_label | bool | True | If True, adds "_auto" to the front of the note type tag to indicate that the tags were added via this automated script. |
| overwrite | typing.Optional[str] | None | Either 'w', 'ws', 'ww', 'a', or None. If 'w' or 'ws', then overwrite any already-existing note type tags (from LABEL_TAGS), whether or not these tags are _auto tags, with the predicted tags. If 'ww', then overwrite only the _auto tags among the already-existing note type tags with the predicted tags. If 'a', then preserve already-existing note type tags and just append the newly predicted ones; in the case that learn predicts the note type whose tag is already in the note, a new tag of that type is not added, even if add_auto_label=True. If None, then do not make modifications to each note if any note type tags already exist in the note; if the predicted note types are different from the already existing note types, then raise a warning. |
| Returns | None | | |
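
One way to read the `overwrite` modes is as a rule for merging a note's existing tags with the predicted ones. The sketch below works on plain lists of tags (without the leading `#`); `merge_type_tags`, `base_type`, and the shortened `TYPE_TAGS` list are hypothetical. It also assumes that a predicted type is skipped whenever a tag of that type survives in the note, which the description only states explicitly for `'a'`.

```python
import warnings

# Assumed subset of note type tags; the real list is LABEL_TAGS.
TYPE_TAGS = ['_meta/concept', '_meta/definition', '_meta/notation', '_meta/proof']
AUTO_PREFIX = '_auto/'


def base_type(tag):
    """Return the note type of `tag` with any `_auto/` prefix removed, else None."""
    bare = tag[len(AUTO_PREFIX):] if tag.startswith(AUTO_PREFIX) else tag
    return bare if bare in TYPE_TAGS else None


def merge_type_tags(existing, predicted, add_auto_label=True, overwrite=None):
    """Return the tag list a note would have after adding `predicted` types."""
    existing_types = {base_type(tag) for tag in existing} - {None}
    if overwrite is None and existing_types:
        # Leave the note untouched, but flag a disagreement with the model.
        if existing_types != set(predicted):
            warnings.warn('Predicted note types differ from the existing ones.')
        return list(existing)
    if overwrite in ('w', 'ws'):
        # Drop every note type tag, auto or not.
        kept = [tag for tag in existing if base_type(tag) is None]
    elif overwrite == 'ww':
        # Drop only the `_auto/` note type tags.
        kept = [tag for tag in existing
                if not (tag.startswith(AUTO_PREFIX) and base_type(tag) is not None)]
    else:
        # 'a', or None when the note has no note type tags yet.
        kept = list(existing)
    kept_types = {base_type(tag) for tag in kept} - {None}
    prefix = AUTO_PREFIX if add_auto_label else ''
    return kept + [prefix + tag for tag in predicted if tag not in kept_types]


print(merge_type_tags(['_meta/literature_note'], ['_meta/definition']))
print(merge_type_tags(['_meta/literature_note', '_meta/definition'],
                      ['_meta/notation'], overwrite='w'))
print(merge_type_tags(['_meta/definition'],
                      ['_meta/definition', '_meta/notation'], overwrite='a'))
```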

In the examples below, we use mock.patch to test adding note types without testing the ML model itself. In particular, we pretend as though the ML model returns certain predictions (technically, we pretend as though predict_note_types returns certain values) to construct these examples.
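
For readers unfamiliar with `unittest.mock`, the following self-contained sketch shows the two mechanisms these examples rely on: `return_value` for a fixed result and `side_effect` for a sequence of results. The `fake_predict_note_types` and `fake_learner` objects are stand-ins invented for this illustration.

```python
from unittest import mock

# `return_value`: every call returns the same canned predictions.
fake_predict_note_types = mock.Mock(return_value=[['#_meta/definition']])
print(fake_predict_note_types(None, None, [None]))  # [['#_meta/definition']]

# `side_effect` with a list: each call returns the next item in turn.
fake_learner = mock.Mock()
fake_learner.predict.side_effect = [
    (['#_meta/definition'],),
    (['NO_TAG'],),
]
first = fake_learner.predict('text one')
second = fake_learner.predict('text two')
print(first[0], second[0])  # ['#_meta/definition'] ['NO_TAG']
```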

The following example demonstrates a basic use case of adding predicted note type tags to notes without any note type tags:

# Here we just test adding a note type without testing the ML model itself.
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    # Example where `add_auto_label` is `True`
    mock_predict_note_types.return_value = [['#_meta/definition'], ['#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_without_tag_labels_Definition 1')
    vn2 = VaultNote(temp_vault, name='reference_without_tag_labels_Theorem 1')
    notes = [vn1, vn2]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes)

    # Note that _auto/_meta/definition has been added
    print(vn1.text())
    assert (MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition'))
    assert (MarkdownFile.from_vault_note(vn2).has_tag('_auto/_meta/concept'))
    assert (MarkdownFile.from_vault_note(vn2).has_tag('_auto/_meta/proof'))


    # Example where `add_auto_label` is `False`
    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation']]
    notes = [VaultNote(temp_vault, name='reference_without_tag_labels_Definition 2')]
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, add_auto_label=False)
    assert (MarkdownFile.from_vault_note(notes[0]).has_tag('_meta/definition'))
    assert (MarkdownFile.from_vault_note(notes[0]).has_tag('_meta/notation'))
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

In the following example, overwrite is set to 'w' (or 'ws'), so all preexisting note type tags are removed before the predicted ones are added:

with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Definition 1')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='w')

    # Note that _meta/definition has been removed and _auto/_meta/definition has been added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

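In the following example, overwrite is meant to be 'ww', so that only the _auto note type tags are removed before the predicted ones are added. The code as written still passes overwrite='w' and therefore shows the same behavior as the previous example:

# TODO: change example so that it passes overwrite='ww' instead of 'w'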
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Definition 1')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='w')

    # Note that _meta/definition has been removed and _auto/_meta/definition has been added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

In the following example, overwrite is set to 'a', so existing note type tags are preserved and only newly predicted note type tags are added:

with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation', '#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='a')

    # Example with `add_auto_label=True`
    # Note that _auto/_meta/notation has been added, but _meta/definition, _meta/concept,
    # and _auto/_meta/proof remain unchanged. Moreover, _auto/_meta/definition and _auto/_meta/concept are
    # NOT added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/notation')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/concept')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/proof')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/concept')


with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation', '#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='a', add_auto_label=False)

    # Example with `add_auto_label=False`
    # Note that _meta/notation has been added, and _auto/_meta/proof is replaced
    # with _meta/proof, but _meta/definition and _meta/concept remain unchanged.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/notation')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/proof')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/proof')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/concept')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/concept, _auto/_meta/proof, _meta/definition, _auto/_meta/notation, _meta/literature_note]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2
---
cssclass: clean-embeds
aliases: []
tags: [_meta/concept, _meta/definition, _meta/literature_note, _meta/notation, _meta/proof]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2

In the following example, overwrite is set to None. Notes that already have note type tags are not modified, but if the note type tags in a note do not match the predicted ones, then a warning is raised.

# TODO: add example
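
The warning behavior for overwrite=None can be sketched without trouver itself. The function keep_existing_types below is a hypothetical stand-in for that branch (it is not part of the library, and the warning text is invented); the point is how a test can observe the warning with warnings.catch_warnings from the standard library.

```python
import warnings


def keep_existing_types(existing_types, predicted_types):
    """Hypothetical stand-in for overwrite=None on a note that has type tags."""
    if set(existing_types) != set(predicted_types):
        warnings.warn(
            f'Predicted {sorted(predicted_types)} but the note has '
            f'{sorted(existing_types)}; leaving the note unchanged.')
    # The existing tags are returned untouched in either case.
    return list(existing_types)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # make sure the warning is not filtered out
    kept = keep_existing_types(['_meta/definition'],
                               ['_meta/concept', '_meta/proof'])

print(kept)
print([str(w.message) for w in caught])
```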

Convert _auto/ tags to regular tags

After checking that the automatically predicted tags are correct, we can convert them to regular tags.


convert_auto_tags_to_regular_tags_in_notes

 convert_auto_tags_to_regular_tags_in_notes (notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                             exclude:list[str]=['links_added', 'notations_added'])

Convert the auto tags into regular tags for the notes.

| | Type | Default | Details |
|---|---|---|---|
| notes | list | | |
| exclude | list | ['links_added', 'notations_added'] | The tags whose _auto/ tags should not be converted. The str should not start with '#' and should not start with '_auto/'. |
| Returns | None | | |
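
At the level of a single note's tag list, the conversion and the role of exclude can be sketched as follows. This is a hypothetical model (convert_auto_tags is not a trouver function) that assumes tags are given in frontmatter form and that duplicates produced by the conversion are dropped.

```python
AUTO = '_auto/'


def convert_auto_tags(tags, exclude=('links_added', 'notations_added')):
    """Hypothetical tag-level model of the conversion; not trouver's code."""
    converted = []
    for tag in tags:
        # `base` is the tag without its '_auto/' prefix, if it has one.
        base = tag[len(AUTO):] if tag.startswith(AUTO) else None
        if base is not None and base not in exclude:
            tag = base
        if tag not in converted:  # assumption: avoid duplicate tags
            converted.append(tag)
    return converted


tags = ['_meta/literature_note', '_meta/concept', '_auto/_meta/proof',
        '_meta/definition', '_auto/links_added']
print(convert_auto_tags(tags))
```

Entries of exclude are written without the '_auto/' prefix, so 'links_added' protects the tag '_auto/links_added' from conversion.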
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    vn = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    convert_auto_tags_to_regular_tags_in_notes([vn])
    print(vn.text())
    mf = MarkdownFile.from_vault_note(vn)
    assert mf.has_tag('_meta/proof')
    assert not mf.has_tag('_auto/_meta/proof')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/concept, _meta/proof, _meta/definition]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2

Footnotes

  1. Note that the tag in the YAML frontmatter meta is written as _meta/definition, without the leading hashtag #.