markdown.obsidian.personal.machine_learning.information_note_types

Functions for gathering machine learning data on the types of math information notes from tags and for using ML models trained on such data to predict types of math information notes.

Some common types of components in mathematical writing include: definitions, notations, concepts (e.g. theorems, propositions, corollaries, lemmas), proofs. The functions in this module gather data from labeled “standard information notes” (formatted in trouver’s standard formatting) in an Obsidian.md vault about the types of these notes. Such data can be used to train a categorization ML model to predict types of unlabeled notes.

The labels are given by Markdown tags in the notes’ YAML frontmatter metadata (so tags in the body of the Markdown file are ignored). For example, the note

---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from

has the tag #_meta/definition.
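
To make the distinction between frontmatter tags and body tags concrete, here is a minimal sketch, independent of trouver, that reads only the inline `tags: [...]` list from a note's YAML frontmatter. The helper name `frontmatter_tags` and the simple line-based parsing are assumptions for illustration; trouver itself reads notes through its `MarkdownFile` class, and its parsing may differ.

```python
import re


def frontmatter_tags(text: str) -> list[str]:
    """Return the tags of an inline `tags: [...]` frontmatter entry, each prefixed with '#'."""
    # The frontmatter is the block between the leading '---' line and the next '---' line.
    match = re.match(r"---\n(.*?)\n---\n", text, flags=re.DOTALL)
    if match is None:
        return []
    for line in match.group(1).splitlines():
        if line.startswith("tags:"):
            inner = line[len("tags:"):].strip().strip("[]")
            return ["#" + tag.strip() for tag in inner.split(",") if tag.strip()]
    return []


note_text = """---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note

A **ring** is a set ... #_meta/notation
"""

tags = frontmatter_tags(note_text)
print(tags)  # ['#_meta/literature_note', '#_meta/definition']
```

The `#_meta/notation` tag written in the body of this hypothetical note does not appear in the result, because only the frontmatter is inspected.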

LABEL_TAGS
['#_meta/concept',
 '#_meta/exercise',
 '#_meta/definition',
 '#_meta/example',
 '#_meta/narrative',
 '#_meta/notation',
 '#_meta/proof',
 '#_meta/remark',
 '#_meta/TODO/split',
 '#_meta/TODO/merge',
 '#_meta/TODO/delete',
 '#_meta/hint',
 '#_meta/how_to',
 '#_meta/conjecture',
 '#_meta/convention',
 '#_meta/context',
 '#_meta/permanent_note',
 '#_meta/question',
 '#_meta/problem']

LABEL_TAGS above lists the tags for the note types that we would like to eventually train a model to predict. The following are the tags for which the author of trouver has ample labeled data:

Note that the author of trouver has only trained a model that predicts some of the note types listed in LABEL_TAGS. Moreover, the accuracy of the predictions can vary widely among the different types.

It is often appropriate to label a single note with more than one of these tags. For example, a note containing the statement “We define the ring \(\mathbb{Z}/n\mathbb{Z}\) of integers modulo \(n\)” is both a definition note and a notation note because it both introduces the notion of the ring of integers modulo \(n\) and gives notation for the ring.
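
As a small illustration of such a multi-label note, the sketch below marks each label tag as present or absent, using the `IS <tag>` / `NOT <tag>` strings that `note_labels` returns later on this page. The names `LABEL_TAGS_SUBSET` and `label_dict` are hypothetical and the note tags are made up for the example; this is not trouver's implementation.

```python
# Illustrative subset of LABEL_TAGS; the full list appears above.
LABEL_TAGS_SUBSET = [
    '#_meta/concept',
    '#_meta/definition',
    '#_meta/notation',
    '#_meta/proof',
]


def label_dict(note_tags: set[str], label_tags: list[str]) -> dict[str, str]:
    """Mark each label tag as present ('IS ...') or absent ('NOT ...')."""
    return {
        tag: f'IS {tag}' if tag in note_tags else f'NOT {tag}'
        for tag in label_tags
    }


# Tags of a hypothetical note that defines Z/nZ and introduces its notation.
note_tags = {'#_meta/literature_note', '#_meta/definition', '#_meta/notation'}
labels = label_dict(note_tags, LABEL_TAGS_SUBSET)
positive = [tag for tag, value in labels.items() if value.startswith('IS ')]
print(positive)  # ['#_meta/definition', '#_meta/notation']
```

Each label is decided independently of the others, which is why a single note can be positive for both `#_meta/definition` and `#_meta/notation`.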

from fastai.text.learner import TextLearner
from fastai.learner import load_learner
import pathlib
from pathlib import WindowsPath
import platform
import shutil
import tempfile
from unittest import mock

from fastcore.test import *
from fastcore.test import all_equal
from torch import tensor

from trouver.helper.tests import _test_directory
from trouver.markdown.obsidian.personal.notes import notes_linked_in_note

Gather and label data


source

note_is_labeled_with_tag

 note_is_labeled_with_tag (note:trouver.markdown.obsidian.vault.VaultNote,
                           label_tag:str, count_auto_tags:bool=False)

*Return True if the standard information note is labeled as being a specified type.

Raises

  • ValueError
    • If label_tag does not include the beginning hashtag #.*

| | Type | Default | Details |
|---|---|---|---|
| note | VaultNote | | |
| label_tag | str | | A tag which labels a type that note is. Includes the beginning hashtag #, e.g. #_meta/definition, #_meta/TODO/split |
| count_auto_tags | bool | False | If True, count #_auto/_meta/<tag> notes as #_meta/<tag> for the purposes of the data collection. |
| Returns | bool | | True if note is labeled as type label_tag. |

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_is_labeled_with_tag` will use
    # a note whose text is `sample_text`.
    assert note_is_labeled_with_tag(mock_note, '#_meta/definition')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/notation')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/concept')

    with ExceptionExpected(ValueError):
        # The argument to `label_tag` requires the starting hashtag `#`.
        note_is_labeled_with_tag(mock_note, '_meta/definition')

Setting count_auto_tags to True makes _auto tags (tags that were added by an ML model) count as labels.

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# This is a title of a note[^1]

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_is_labeled_with_tag` will use
    # a note whose text is `sample_text`.
    assert note_is_labeled_with_tag(mock_note, '#_meta/definition', count_auto_tags=True)
    assert not note_is_labeled_with_tag(mock_note, '#_meta/notation')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/concept')

    with ExceptionExpected(ValueError):
        # The argument to `label_tag` requires the starting hashtag `#`.
        note_is_labeled_with_tag(mock_note, '_meta/definition')

# Test count_auto_tags=False

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# This is a title of a note[^1]

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_is_labeled_with_tag` will use
    # a note whose text is `sample_text`.
    assert not note_is_labeled_with_tag(mock_note, '#_meta/definition', count_auto_tags=False)
    assert not note_is_labeled_with_tag(mock_note, '#_meta/notation')
    assert not note_is_labeled_with_tag(mock_note, '#_meta/concept')

    with ExceptionExpected(ValueError):
        # The argument to `label_tag` requires the starting hashtag `#`.
        note_is_labeled_with_tag(mock_note, '_meta/definition')
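
The behavior exercised by the tests above can be summarized in a few lines of plain Python. This is only a sketch of the documented contract (the `ValueError` for a missing `#` and the treatment of `#_auto/_meta/<tag>` tags); the function name `tag_counts_as_label` and the assumption that the note's tags are already available as `#`-prefixed strings are illustrative, not trouver's actual code.

```python
def tag_counts_as_label(
        note_tags: list[str], label_tag: str, count_auto_tags: bool = False) -> bool:
    """Return True if `label_tag` is among `note_tags`, optionally counting `_auto` tags."""
    if not label_tag.startswith('#'):
        raise ValueError(f'Expected `label_tag` to start with "#", got {label_tag!r}')
    if label_tag in note_tags:
        return True
    if count_auto_tags:
        # '#_meta/definition' corresponds to the auto tag '#_auto/_meta/definition'.
        return '#_auto/' + label_tag[1:] in note_tags
    return False


auto_tagged = ['#_meta/literature_note', '#_auto/_meta/definition']
print(tag_counts_as_label(auto_tagged, '#_meta/definition', count_auto_tags=True))   # True
print(tag_counts_as_label(auto_tagged, '#_meta/definition', count_auto_tags=False))  # False
```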

source

note_labels

 note_labels (note:trouver.markdown.obsidian.vault.VaultNote,
              count_auto_tags:bool=False)

*Return a dict indicating what labels a note has.

The labels come from the LABEL_TAGS list.*

| | Type | Default | Details |
|---|---|---|---|
| note | VaultNote | | |
| count_auto_tags | bool | False | If True, count #_auto/_meta/<tag> notes as #_meta/<tag> for the purposes of the data collection. |
| Returns | dict | | |

sample_text = r"""---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/definition]
---
# This is a title of a note[^1]

We could talk about many things. I like to talk about rings!

A **ring** is a set equipped with two binary operators $+$ and $\cdot$
such that...

# See Also

# Meta
## References
![[_reference_sample_reference]]

## Citations and Footnotes
[^1]: Author names, Some way to identify where this information comes from
"""
sample_mf = MarkdownFile.from_string(sample_text)

with mock.patch("__main__.MarkdownFile.from_vault_note", return_value=sample_mf) as mock_markdownfile_from_vault_note:
    mock_note = None
    # This is set up in such a way that the invocation to
    # `note_labels` will use
    # a note whose text is `sample_text`.
    sample_output = note_labels(mock_note)
    test_eq(sample_output['#_meta/definition'], 'IS #_meta/definition')
    test_eq(sample_output['#_meta/concept'], 'NOT #_meta/concept')
    for label_tag in LABEL_TAGS:
        assert label_tag in sample_output
    print(sample_output)
{'#_meta/concept': 'NOT #_meta/concept', '#_meta/exercise': 'NOT #_meta/exercise', '#_meta/definition': 'IS #_meta/definition', '#_meta/example': 'NOT #_meta/example', '#_meta/narrative': 'NOT #_meta/narrative', '#_meta/notation': 'NOT #_meta/notation', '#_meta/proof': 'NOT #_meta/proof', '#_meta/remark': 'NOT #_meta/remark', '#_meta/TODO/split': 'NOT #_meta/TODO/split', '#_meta/TODO/merge': 'NOT #_meta/TODO/merge', '#_meta/TODO/delete': 'NOT #_meta/TODO/delete', '#_meta/hint': 'NOT #_meta/hint', '#_meta/how_to': 'NOT #_meta/how_to', '#_meta/conjecture': 'NOT #_meta/conjecture', '#_meta/convention': 'NOT #_meta/convention', '#_meta/context': 'NOT #_meta/context', '#_meta/permanent_note': 'NOT #_meta/permanent_note', '#_meta/question': 'NOT #_meta/question', '#_meta/problem': 'NOT #_meta/problem'}

Gather data into a dataset

Obtaining data on information note types is fairly simple: for each note, obtain the YAML frontmatter metadata and read the note type labels from its tags.


source

labels_and_identifying_info_from_notes

 labels_and_identifying_info_from_notes (vault:os.PathLike,
                                         notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                         count_auto_tags:bool=False,
                                         raise_error_that_arises:bool=True)

*Return a list of dicts, each corresponding to the data from an information note.

The keys in each dict are as follows:

  • Time added - The time when the row was added.
  • Time modified - The time when the labels of the row were last modified.
  • Note name - The name of the note from which the data for the row was derived.
  • Full note content - The entire content/text of the note.
  • Processed note content - The “raw” content of the note without the YAML frontmatter meta, Markdown headings, links, footnotes, etc.
  • The various labels for note types (e.g. #_meta/definition)

All timestamps are in UTC time and specify time to minutes (i.e. no seconds/microseconds).*

| | Type | Default | Details |
|---|---|---|---|
| vault | PathLike | | The vault from which the notes come; this is to invoke process_standard_information_note. |
| notes | list | | Assumed to only contain standard information notes from which note type labels are to be gathered. |
| count_auto_tags | bool | False | If True, count #_auto/_meta/<tag> notes as #_meta/<tag> for the purposes of the data collection. |
| raise_error_that_arises | bool | True | If True, raise errors that arise as data is gathered from individual notes. Otherwise, print the would-be error message for individual notes, but do not include the data from the note. |
| Returns | list | | Each dict has keys Time added, Time modified, Note name, Full note content, Processed note content as well as keys for each tag label. |
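
For orientation, the shape of one such dict can be sketched as follows. The timestamp format (UTC, truncated to minutes) matches the `2024-12-16T17:26` values shown in the outputs further below; the helper names `utc_minute_timestamp` and `make_row` and the sample note contents are assumptions for illustration and not part of trouver.

```python
from datetime import datetime, timezone


def utc_minute_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime in UTC, truncated to minutes."""
    return moment.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%M')


def make_row(note_name, full_content, processed_content, labels, moment):
    """Assemble one row; a freshly added row gets the same added/modified time."""
    stamp = utc_minute_timestamp(moment)
    row = {
        'Time added': stamp,
        'Time modified': stamp,
        'Note name': note_name,
        'Full note content': full_content,
        'Processed note content': processed_content,
    }
    row.update(labels)  # one key per label tag, e.g. '#_meta/definition'
    return row


row = make_row(
    'reference_with_tag_labels_Definition 1',
    '---\ntags: [_meta/definition]\n---\n# Ring\n\nA **ring** is ...',
    'A ring is ...',
    {'#_meta/definition': 'IS #_meta/definition'},
    datetime(2024, 12, 16, 17, 26, 45, tzinfo=timezone.utc),
)
print(row['Time added'])  # 2024-12-16T17:26
```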

source

information_note_types_as_dataset

 information_note_types_as_dataset (vault:os.PathLike,
                                    notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                    count_auto_tags:bool=False,
                                    raise_error_that_arises:bool=True)

| | Type | Default | Details |
|---|---|---|---|
| vault | PathLike | | The vault from which the notes come; this is to invoke process_standard_information_note. |
| notes | list | | Assumed to only contain standard information notes from which note type labels are to be gathered. |
| count_auto_tags | bool | False | If True, count #_auto/_meta/<tag> notes as #_meta/<tag> for the purposes of the data collection. |
| raise_error_that_arises | bool | True | If True, raise errors that arise as data is gathered from individual notes. Otherwise, print the would-be error message for individual notes, but do not include the data from the note. |
| Returns | Dataset | | |

information_note_types_as_dataset gathers data on information note types as a Dataset from the Hugging Face datasets library.

test_vault = _test_directory() / 'test_vault_6'
index_note = VaultNote(test_vault, name='_index_1_introduction_reference_with_tag_labels')
# There should be 6 notes
notes = notes_linked_in_note(index_note, as_dict=False)
dataset = information_note_types_as_dataset(test_vault, notes)
dataset
# test_eq(len(df), 6)
# df.head()
c:\Users\hyunj\Documents\Development\Python\trouver_py310_venv\lib\site-packages\bs4\__init__.py:435: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  warnings.warn(
Dataset({
    features: ['Time added', 'Time modified', 'Note name', 'Full note content', 'Processed note content', '#_meta/concept', '#_meta/exercise', '#_meta/definition', '#_meta/example', '#_meta/narrative', '#_meta/notation', '#_meta/proof', '#_meta/remark', '#_meta/TODO/split', '#_meta/TODO/merge', '#_meta/TODO/delete', '#_meta/hint', '#_meta/how_to', '#_meta/conjecture', '#_meta/convention', '#_meta/context', '#_meta/permanent_note', '#_meta/question', '#_meta/problem'],
    num_rows: 6
})

Let us view the dataset as a pandas DataFrame.

dataset.to_pandas()
Time added Time modified Note name Full note content Processed note content #_meta/concept #_meta/exercise #_meta/definition #_meta/example #_meta/narrative ... #_meta/TODO/merge #_meta/TODO/delete #_meta/hint #_meta/how_to #_meta/conjecture #_meta/convention #_meta/context #_meta/permanent_note #_meta/question #_meta/problem
0 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_something_something ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/narrative]\n---\n# Topic[^1]\n\nIn this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n\n\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Page 1 In this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n NOT #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example IS #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
1 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Definition 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition]\n---\n# Ring[^1]\n\nA **ring** is a set with binary operators $+$ and $\cdot$ such that ...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Definition 1 A ring is a set with binary operators $+$ and $\cdot$ such that ...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
2 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Definition 2 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition, _meta/notation]\n---\n# Ring of integers modulo $n$[^1]\n\nLet $n \geq 1$ be an integer. The **ring of integers modulo $n$**, denoted by **$\mathbb{Z}/n\mathbb{Z}$**, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n\n\n# See Also\n- [[reference_with_tag_labels_Exercise 1|reference_with_tag_labels_Z_nZ_is_a_ring]]\n# Meta\n## References\n\n## ... Let $n \geq 1$ be an integer. The ring of integers modulo $n$, denoted by $\mathbb{Z}/n\mathbb{Z}$, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
3 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Exercise 1 ---\ncssclass: clean-embeds\naliases: [reference_with_tag_labels_Z_nZ_is_a_ring]\ntags: [_meta/literature_note, _meta/exercise]\n---\n# $\mathbb{Z}/n\mathbb{Z}$ is a ring[^1]\n\nShow that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Exercise 1 Show that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n NOT #_meta/concept IS #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
4 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Theorem 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _meta/proof]\n---\n# The polynomial ring of a UFD is a UFD[^1]\n\nTheorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Theorem 1 Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n IS #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
5 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Theorem 2 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _auto/_meta/proof, _meta/definition]\n---\n%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%\n\n# The polynomial ring of a UFD is a UFD[^1]\nLet $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the ... %%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%\n\nLet $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted $\mathbb{F}_q$ and is called the finite field of $q$ elements.\n\nProof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,... IS #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem

6 rows × 24 columns
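
Before training a classifier, the `IS <tag>` / `NOT <tag>` strings in the label columns usually need to be turned into numeric targets. The sketch below shows one possible encoding, a multi-hot vector per note, on two hand-written rows modeled on the table above. The names `LABEL_COLUMNS` and `multi_hot` are hypothetical, and this page does not state how trouver's own models encode their targets.

```python
# Illustrative subset of the label columns shown in the table above.
LABEL_COLUMNS = ['#_meta/concept', '#_meta/definition', '#_meta/notation']

# Two hand-written rows modeled on the "Definition 1" and "Definition 2" notes.
rows = [
    {'Note name': 'reference_with_tag_labels_Definition 1',
     '#_meta/concept': 'NOT #_meta/concept',
     '#_meta/definition': 'IS #_meta/definition',
     '#_meta/notation': 'NOT #_meta/notation'},
    {'Note name': 'reference_with_tag_labels_Definition 2',
     '#_meta/concept': 'NOT #_meta/concept',
     '#_meta/definition': 'IS #_meta/definition',
     '#_meta/notation': 'IS #_meta/notation'},
]


def multi_hot(row: dict[str, str], label_columns: list[str]) -> list[int]:
    """Encode a row's label columns as 1 (value starts with 'IS ') or 0 (otherwise)."""
    return [1 if row[column].startswith('IS ') else 0 for column in label_columns]


targets = [multi_hot(row, LABEL_COLUMNS) for row in rows]
print(targets)  # [[0, 1, 0], [0, 1, 1]]
```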

(Deprecated) Gather data into a pandas.DataFrame

source

gather_information_note_types

 gather_information_note_types (vault:os.PathLike,
                                notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                raise_error_that_arises:bool=True)

Return a pandas.DataFrame encapsulating the data of note labels.

| | Type | Default | Details |
|---|---|---|---|
| vault | PathLike | | |
| notes | list | | |
| raise_error_that_arises | bool | True | |
| Returns | DataFrame | | Has columns Time added, Time modified, Note name, Full note content, Processed note content as well as columns for each tag label. See append_to_information_note_type_database for more details about these columns. |

test_vault = _test_directory() / 'test_vault_6'
index_note = VaultNote(test_vault, name='_index_1_introduction_reference_with_tag_labels')
# There should be 6 notes; `df.head()` below shows only the first 5
notes = notes_linked_in_note(index_note, as_dict=False)
df = gather_information_note_types(test_vault, notes)
test_eq(len(df), 6)
df.head()
C:\Users\hyunj\AppData\Local\Temp\ipykernel_24944\2112522642.py:5: DeprecationWarning: Call to deprecated function (or staticmethod) gather_information_note_types. (Use Using a pandas `DataFrame` is slow. Use `information_note_types_as_dataset` instead to gather data as a `Dataset`.)
  df = gather_information_note_types(test_vault, notes)
c:\Users\hyunj\Documents\Development\Python\trouver_py310_venv\lib\site-packages\bs4\__init__.py:435: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  warnings.warn(
Time added Time modified Note name Full note content Processed note content #_meta/concept #_meta/exercise #_meta/definition #_meta/example #_meta/narrative ... #_meta/TODO/merge #_meta/TODO/delete #_meta/hint #_meta/how_to #_meta/conjecture #_meta/convention #_meta/context #_meta/permanent_note #_meta/question #_meta/problem
0 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_something_something ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/narrative]\n---\n# Topic[^1]\n\nIn this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n\n\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Page 1 In this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n NOT #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example IS #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
1 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Definition 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition]\n---\n# Ring[^1]\n\nA **ring** is a set with binary operators $+$ and $\cdot$ such that ...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Definition 1 A ring is a set with binary operators $+$ and $\cdot$ such that ...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
2 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Definition 2 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition, _meta/notation]\n---\n# Ring of integers modulo $n$[^1]\n\nLet $n \geq 1$ be an integer. The **ring of integers modulo $n$**, denoted by **$\mathbb{Z}/n\mathbb{Z}$**, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n\n\n# See Also\n- [[reference_with_tag_labels_Exercise 1|reference_with_tag_labels_Z_nZ_is_a_ring]]\n# Meta\n## References\n\n## ... Let $n \geq 1$ be an integer. The ring of integers modulo $n$, denoted by $\mathbb{Z}/n\mathbb{Z}$, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n NOT #_meta/concept NOT #_meta/exercise IS #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
3 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Exercise 1 ---\ncssclass: clean-embeds\naliases: [reference_with_tag_labels_Z_nZ_is_a_ring]\ntags: [_meta/literature_note, _meta/exercise]\n---\n# $\mathbb{Z}/n\mathbb{Z}$ is a ring[^1]\n\nShow that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Exercise 1 Show that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n NOT #_meta/concept IS #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem
4 2024-12-16T17:26 2024-12-16T17:26 reference_with_tag_labels_Theorem 1 ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _meta/proof]\n---\n# The polynomial ring of a UFD is a UFD[^1]\n\nTheorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Theorem 1 Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n IS #_meta/concept NOT #_meta/exercise NOT #_meta/definition NOT #_meta/example NOT #_meta/narrative ... NOT #_meta/TODO/merge NOT #_meta/TODO/delete NOT #_meta/hint NOT #_meta/how_to NOT #_meta/conjecture NOT #_meta/convention NOT #_meta/context NOT #_meta/permanent_note NOT #_meta/question NOT #_meta/problem

5 rows × 24 columns


source

append_to_information_note_type_database

 append_to_information_note_type_database (vault:os.PathLike,
                                           file:os.PathLike,
                                           notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                           backup:bool=True)

*Either create a csv file containing data for information note type labels or append to an existing csv file.

The columns of the database file are as follows:

  • Time added - The time when the row was added.
  • Time modified - The time when the labels of the row were last modified.
  • Note name - The name of the note from which the data for the row was derived.
  • Full note content - The entire content/text of the note.
  • Processed note content - The “raw” content of the note without the YAML frontmatter meta, Markdown headings, links, footnotes, etc.

All timestamps are in UTC time and specify time to minutes (i.e. no seconds/microseconds).

If a “new” note has the same processed content as a pre-existing note and anything is different about the “new” note, then the row of the existing note is updated. In particular, the following are updated:

  • Time modified (set to the current time)
  • Note name (overwritten)
  • Full note content (overwritten)
  • Columns for categorization (overwritten)

This method assumes that, if the CSV file exists, the processed contents in it are all distinct.*

| | Type | Default | Details |
|---|---|---|---|
| vault | PathLike | | The vault from which the data is drawn |
| file | PathLike | | The path to a CSV file |
| notes | list | | The notes to add to the database |
| backup | bool | True | If True, makes a copy of file in the same directory and with the same name, except with an added extension of .bak. |
| Returns | None | | |

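
The update rule described above (match rows by processed content, overwrite the changed fields, refresh Time modified, and otherwise append) can be sketched on plain lists of dicts. The function name `upsert_rows`, the constant `KEPT_ON_UPDATE`, and the in-memory representation are assumptions for illustration; the real function works on a CSV file and may differ in detail.

```python
# Fields of an existing row that an update does not copy from the new row.
KEPT_ON_UPDATE = ('Time added', 'Time modified')


def upsert_rows(existing: list[dict], new_rows: list[dict], now: str) -> list[dict]:
    """Append unseen rows; update rows whose processed content already exists."""
    # Assumes the processed contents in `existing` are all distinct.
    by_content = {row['Processed note content']: row for row in existing}
    for new in new_rows:
        key = new['Processed note content']
        old = by_content.get(key)
        if old is None:
            existing.append(new)
            by_content[key] = new
            continue
        fields = {k: v for k, v in new.items() if k not in KEPT_ON_UPDATE}
        if any(old.get(k) != v for k, v in fields.items()):
            old.update(fields)          # name, full content, label columns
            old['Time modified'] = now  # 'Time added' is left as it was
    return existing


database = [{
    'Time added': '2024-12-16T17:26', 'Time modified': '2024-12-16T17:26',
    'Note name': 'old_name', 'Full note content': 'old text',
    'Processed note content': 'A ring is ...',
    '#_meta/definition': 'NOT #_meta/definition',
}]
incoming = [
    {'Time added': '2024-12-17T09:00', 'Time modified': '2024-12-17T09:00',
     'Note name': 'new_name', 'Full note content': 'new text',
     'Processed note content': 'A ring is ...',
     '#_meta/definition': 'IS #_meta/definition'},
    {'Time added': '2024-12-17T09:00', 'Time modified': '2024-12-17T09:00',
     'Note name': 'another_note', 'Full note content': 'other text',
     'Processed note content': 'Show that ...',
     '#_meta/definition': 'NOT #_meta/definition'},
]
database = upsert_rows(database, incoming, now='2024-12-17T09:00')
print(len(database), database[0]['Note name'], database[0]['Time added'])
```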
with tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir:
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    index_note = VaultNote(temp_vault, name='_index_1_introduction_reference_with_tag_labels')
    notes = notes_linked_in_note(index_note, as_dict=False)
    file = temp_vault / '_ml_data' / 'information_note_type_labels.csv'
    append_to_information_note_type_database(
         temp_vault, file, notes)

    # Uncomment these lines to see `temp_vault` and its contents.
#     os.startfile(os.getcwd())
#     input()
    df = pd.read_csv(file)
    print(df.head())
C:\Users\hyunj\AppData\Local\Temp\ipykernel_24944\2798749500.py:8: DeprecationWarning: Call to deprecated function (or staticmethod) append_to_information_note_type_database. (Use Using a pandas `DataFrame` is slow. Use `information_note_types_as_dataset` instead to gather data as a `Dataset`.)
  append_to_information_note_type_database(
C:\Users\hyunj\AppData\Local\Temp\ipykernel_24944\381146521.py:42: DeprecationWarning: Call to deprecated function (or staticmethod) gather_information_note_types. (Use Using a pandas `DataFrame` is slow. Use `information_note_types_as_dataset` instead to gather data as a `Dataset`.)
  new_df = gather_information_note_types(vault, notes)
         Time added     Time modified  \
0  2024-12-16T17:26  2024-12-16T17:26   
1  2024-12-16T17:26  2024-12-16T17:26   
2  2024-12-16T17:26  2024-12-16T17:26   
3  2024-12-16T17:26  2024-12-16T17:26   
4  2024-12-16T17:26  2024-12-16T17:26   

                                       Note name  \
0  reference_with_tag_labels_something_something   
1         reference_with_tag_labels_Definition 1   
2         reference_with_tag_labels_Definition 2   
3           reference_with_tag_labels_Exercise 1   
4            reference_with_tag_labels_Theorem 1   

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Full note content  \
0                                                                                                                                                                                                    ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/narrative]\n---\n# Topic[^1]\n\nIn this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n\n\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Page 1   
1                                                                                                                                                                                                                                                                                                                                      ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition]\n---\n# Ring[^1]\n\nA **ring** is a set with binary operators $+$ and $\cdot$ such that ...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Definition 1   
2  ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/definition, _meta/notation]\n---\n# Ring of integers modulo $n$[^1]\n\nLet $n \geq 1$ be an integer. The **ring of integers modulo $n$**, denoted by **$\mathbb{Z}/n\mathbb{Z}$**, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n\n\n# See Also\n- [[reference_with_tag_labels_Exercise 1|reference_with_tag_labels_Z_nZ_is_a_ring]]\n# Meta\n## References\n\n## ...   
3                                                                                                                                                                                                                                                                                                   ---\ncssclass: clean-embeds\naliases: [reference_with_tag_labels_Z_nZ_is_a_ring]\ntags: [_meta/literature_note, _meta/exercise]\n---\n# $\mathbb{Z}/n\mathbb{Z}$ is a ring[^1]\n\nShow that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Exercise 1   
4                                                                                                                                                          ---\ncssclass: clean-embeds\naliases: []\ntags: [_meta/literature_note, _meta/concept, _meta/proof]\n---\n# The polynomial ring of a UFD is a UFD[^1]\n\nTheorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n\n# See Also\n\n# Meta\n## References\n\n## Citations and Footnotes\n[^1]: Kim, Theorem 1   

                                                                                                                                                                                                                                                                                                   Processed note content  \
0                                                                                                           In this chapter, we describe some basics of ring theory. Rings are mathematical structures which generalize the structures of the familiar integers, rational numbers, real numbers, complex numberes, etc.\n   
1                                                                                                                                                                                                                                                   A ring is a set with binary operators $+$ and $\cdot$ such that ...\n   
2  Let $n \geq 1$ be an integer. The ring of integers modulo $n$, denoted by $\mathbb{Z}/n\mathbb{Z}$, is, informally, the ring whose elements are represented by the integers with the understanding that $0$ and $n$ are equal.\n\nMore precisely, $\mathbb{Z}/n\mathbb{Z}$ has the elements $0,1,\ldots,n-1$.\n\n...\n   
3                                                                                                                                                                                                                                                                         Show that $\mathbb{Z}/n\mathbb{Z}$ is a ring.\n   
4                                                                                                           Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.\n\nProof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.\n\n...\n   

       #_meta/concept      #_meta/exercise      #_meta/definition  \
0  NOT #_meta/concept  NOT #_meta/exercise  NOT #_meta/definition   
1  NOT #_meta/concept  NOT #_meta/exercise   IS #_meta/definition   
2  NOT #_meta/concept  NOT #_meta/exercise   IS #_meta/definition   
3  NOT #_meta/concept   IS #_meta/exercise  NOT #_meta/definition   
4   IS #_meta/concept  NOT #_meta/exercise  NOT #_meta/definition   

       #_meta/example      #_meta/narrative  ...      #_meta/TODO/merge  \
0  NOT #_meta/example   IS #_meta/narrative  ...  NOT #_meta/TODO/merge   
1  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/TODO/merge   
2  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/TODO/merge   
3  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/TODO/merge   
4  NOT #_meta/example  NOT #_meta/narrative  ...  NOT #_meta/TODO/merge   

       #_meta/TODO/delete      #_meta/hint      #_meta/how_to  \
0  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
1  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
2  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
3  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   
4  NOT #_meta/TODO/delete  NOT #_meta/hint  NOT #_meta/how_to   

       #_meta/conjecture      #_meta/convention      #_meta/context  \
0  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context   
1  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context   
2  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context   
3  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context   
4  NOT #_meta/conjecture  NOT #_meta/convention  NOT #_meta/context   

       #_meta/permanent_note      #_meta/question      #_meta/problem  
0  NOT #_meta/permanent_note  NOT #_meta/question  NOT #_meta/problem  
1  NOT #_meta/permanent_note  NOT #_meta/question  NOT #_meta/problem  
2  NOT #_meta/permanent_note  NOT #_meta/question  NOT #_meta/problem  
3  NOT #_meta/permanent_note  NOT #_meta/question  NOT #_meta/problem  
4  NOT #_meta/permanent_note  NOT #_meta/question  NOT #_meta/problem  

[5 rows x 24 columns]
c:\Users\hyunj\Documents\Development\Python\trouver_py310_venv\lib\site-packages\bs4\__init__.py:435: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  warnings.warn(

Use the trained model to predict note types

After training the model (cf. how_to.train_ml_model.fastai), we can predict the types of notes.


possible_text_type_labels

 possible_text_type_labels (learn:fastai.text.learner.TextLearner)

Return the possible labels that learn.predict can output.


predict_text_types_with_one_learner

 predict_text_types_with_one_learner (learner:fastai.text.learner.TextLearner,
                                      texts:list[str],
                                      remove_NO_TAG:bool=True,
                                      include_probabilities:bool=False)

Predict the types of mathematical texts using an ML model.

| | Type | Default | Details |
|---|---|---|---|
| learner | TextLearner | | The ML model predicting note types. |
| texts | list | | |
| remove_NO_TAG | bool | True | If True, remove NO_TAG, which in theory indicates that no types are predicted, but in practice can be predicted along with actual types. |
| include_probabilities | bool | False | If True, then each returned entry is a tuple that also contains the probabilities of all possible types (see Returns). |
| Returns | list | | Each list or tuple corresponds to an entry of texts and contains the predicted types of that text. A list[str] consists of the predicted types/labels of the text; a tuple[list[str], dict[str,float]] contains the list of predicted types along with a dict mapping every type predictable by learner to its probability. |
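
The effect of `remove_NO_TAG` and `include_probabilities` on the shape of each returned entry can be sketched without a trained model. The helper `shape_prediction` below is hypothetical and not part of trouver; it only mimics the return shapes documented in the table, assuming that `NO_TAG` is dropped from the label list but kept in the probability dict.

```python
def shape_prediction(
        labels: list[str],
        probabilities: dict[str, float],
        remove_NO_TAG: bool = True,
        include_probabilities: bool = False):
    """Hypothetical helper mimicking the documented return shapes."""
    if remove_NO_TAG:
        # Drop the placeholder label from the list of predicted types only.
        labels = [label for label in labels if label != 'NO_TAG']
    if include_probabilities:
        return labels, probabilities
    return labels


labels = ['#_meta/TODO/delete', 'NO_TAG']
probabilities = {'#_meta/TODO/delete': 0.61, 'NO_TAG': 0.98}

print(shape_prediction(labels, probabilities))
# ['#_meta/TODO/delete']
print(shape_prediction(labels, probabilities, include_probabilities=True))
# (['#_meta/TODO/delete'], {'#_meta/TODO/delete': 0.61, 'NO_TAG': 0.98})
```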

We can predict types of short mathematical texts. Suppose that the information note type classification model trained in how_to.train_ml_model.fastai has been loaded, e.g. via fastai’s load_learner function:

model = load_learner(<path_to_model>)
texts_to_predict = [
    '',
    'In this chapter, we introduce the notion of rings, some related notions, and many examples.',
    # Raw strings keep the LaTeX backslashes from being read as escape sequences.
    r'A ring is a set equipped with two binary operators $+$ and $\cdot$ such that...',
    'Theorem. For every prime power $q$, there is, up to isomorphism, exactly one field with $q$ elements.\n\nProof. Let $q = p^k$ where $p$ is a prime. Let $F$ be a field with $q$ elements. Note that...',
    r'Remark. Note that $\mathbb{F}_q$ and $\mathbb{Z}/q\mathbb{Z}$ are different rings',
    r'As an example, take $\mathbb{F}_9$. It can be presented as $\mathbb{F}_3[x^2+1]$ as well as $\mathbb{F}_3[x^2+x+2]$.'
]
sample_outputs = predict_text_types_with_one_learner(
    model, texts_to_predict)

print(sample_outputs)
[['#_meta/TODO/delete', '#_meta/TODO/merge', '#_meta/TODO/split', '#_meta/concept', '#_meta/proof'], ['#_meta/TODO/split'], ['#_meta/TODO/split', '#_meta/definition'], ['#_meta/concept', '#_meta/proof'], ['#_meta/TODO/split', '#_meta/remark'], ['#_meta/example', '#_meta/narrative']]
possible_text_type_labels(model)
sample_outputs = predict_text_types_with_one_learner(model, texts_to_predict, include_probabilities=True)
print(sample_outputs[0][0])
print(sample_outputs[0][1])
['#_meta/TODO/delete', '#_meta/TODO/merge', '#_meta/TODO/split', '#_meta/concept', '#_meta/proof']
{'#_meta/TODO/delete': 1.0, '#_meta/TODO/merge': 0.8483227491378784, '#_meta/TODO/split': 0.9903709292411804, '#_meta/concept': 0.6789883375167847, '#_meta/conjecture': 8.332841361369248e-12, '#_meta/convention': 0.3710557222366333, '#_meta/definition': 4.3158637708984315e-05, '#_meta/example': 4.248961886332836e-06, '#_meta/exercise': 1.3349553228181321e-05, '#_meta/how_to': 6.355472578434274e-05, '#_meta/narrative': 7.353986575253657e-07, '#_meta/notation': 1.661649002926424e-05, '#_meta/proof': 0.9312137365341187, '#_meta/remark': 0.001952954800799489, 'NO_TAG': 4.766645744780362e-09}
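
The probability dict shows how close each label was to being predicted. As a rough check, the sketch below copies a few (rounded) entries of the dict above and keeps the labels whose probability exceeds 0.5. The 0.5 cutoff is an assumption, chosen because it reproduces the predicted list shown above; it is not a documented setting of the model.

```python
# A few entries copied (and rounded) from the output above.
probabilities = {
    '#_meta/TODO/delete': 1.0,
    '#_meta/TODO/merge': 0.8483,
    '#_meta/TODO/split': 0.9904,
    '#_meta/concept': 0.6790,
    '#_meta/convention': 0.3711,
    '#_meta/proof': 0.9312,
    '#_meta/remark': 0.0020,
    'NO_TAG': 4.77e-09,
}

THRESHOLD = 0.5  # Assumed cutoff, not taken from the library.

above_threshold = [
    label for label, p in probabilities.items() if p > THRESHOLD]
print(above_threshold)
# ['#_meta/TODO/delete', '#_meta/TODO/merge', '#_meta/TODO/split', '#_meta/concept', '#_meta/proof']
```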

consolidate_single_text_predictions_by_sum_of_confidence

 consolidate_single_text_predictions_by_sum_of_confidence
     (predictions_for_single_text:list[tuple[list[str],dict[str,float]]])

Consolidate single text predictions by summing the “probabilities” predicted by the various models. If the sum of the probabilities that the label should be predicted is greater than the sum of the probabilities that the label should not be predicted, then the label is predicted.

This function can be passed as the consolidation parameter of the predict_note_types function (it is the default).

| | Type | Details |
|---|---|---|
| predictions_for_single_text | list | Each tuple corresponds to the predictions made by each model. |
| Returns | list | The labels |
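
The rule described above can be sketched in plain Python. This is a standalone re-implementation for illustration, not trouver's code; the name `consolidate_by_sum_of_confidence` is hypothetical, and the sketch assumes that the probability that a label should not be predicted is `1 - p`.

```python
def consolidate_by_sum_of_confidence(
        predictions_for_single_text: list[tuple[list[str], dict[str, float]]]
        ) -> list[str]:
    """Standalone sketch of the summing rule (not trouver's implementation)."""
    sum_for: dict[str, float] = {}
    sum_against: dict[str, float] = {}
    for _, probabilities in predictions_for_single_text:
        for label, p in probabilities.items():
            sum_for[label] = sum_for.get(label, 0.0) + p
            # Assumption: "should not be predicted" has probability 1 - p.
            sum_against[label] = sum_against.get(label, 0.0) + (1.0 - p)
    return [label for label in sum_for if sum_for[label] > sum_against[label]]


# Two made-up models that disagree on individual labels.
model_1 = (
    ['#_meta/concept', '#_meta/definition'],
    {'#_meta/concept': 0.9, '#_meta/definition': 0.6, '#_meta/proof': 0.4})
model_2 = (
    ['#_meta/proof'],
    {'#_meta/concept': 0.3, '#_meta/definition': 0.2, '#_meta/proof': 0.7})

consolidated = consolidate_by_sum_of_confidence([model_1, model_2])
print(consolidated)
# ['#_meta/concept', '#_meta/proof']
```

In this made-up input, `#_meta/definition` is dropped even though the first model predicts it, because the second model's low confidence outweighs it.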

predict_note_types

 predict_note_types (learners:fastai.text.learner.TextLearner|list[fastai.text.learner.TextLearner],
                     vault:os.PathLike,
                     notes:list[trouver.markdown.obsidian.vault.VaultNote],
                     remove_NO_TAG:bool=True,
                     consolidation:Optional[Callable]=<function consolidate_single_text_predictions_by_sum_of_confidence>)

Predict the types of the given notes using one or more ML models.

| | Type | Default | Details |
|---|---|---|---|
| learners | fastai.text.learner.TextLearner \| list[fastai.text.learner.TextLearner] | | The ML model(s) predicting note types. |
| vault | PathLike | | The vault with the notes. |
| notes | list | | The notes with texts to predict. |
| remove_NO_TAG | bool | True | If True, remove NO_TAG, which in theory indicates that no types are predicted, but in practice can be predicted along with actual types. |
| consolidation | Optional | consolidate_single_text_predictions_by_sum_of_confidence | The method used to consolidate the predictions when learners contains more than one model. |
| Returns | list | | Each list[str] corresponds to an item in notes and contains the predicted note types for that note. |
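
Since `consolidation` is a callable, a different consolidation rule can presumably be supplied. The sketch below assumes that the callable receives the same input as `consolidate_single_text_predictions_by_sum_of_confidence` (one `(labels, probabilities)` tuple per model) and returns a `list[str]`. The name `consolidate_by_majority_vote` is hypothetical and not part of trouver.

```python
from collections import Counter


def consolidate_by_majority_vote(
        predictions_for_single_text: list[tuple[list[str], dict[str, float]]]
        ) -> list[str]:
    """Keep a label if more than half of the models predicted it."""
    num_models = len(predictions_for_single_text)
    votes = Counter(
        label
        for labels, _ in predictions_for_single_text
        for label in set(labels))
    return sorted(
        label for label, count in votes.items() if count > num_models / 2)


# Made-up predictions from three models; this rule ignores the probabilities.
predictions = [
    (['#_meta/concept', '#_meta/proof'], {}),
    (['#_meta/concept'], {}),
    (['#_meta/remark'], {}),
]
majority = consolidate_by_majority_vote(predictions)
print(majority)
# ['#_meta/concept']
```

Under the stated assumption about the callable's signature, such a function could be tried as `predict_note_types(learners, vault, notes, consolidation=consolidate_by_majority_vote)`.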
# TODO: tests
# predict_note_types

test_vault = _test_directory() / 'test_vault_6'
notes_to_predict = [
    VaultNote(test_vault, name='reference_with_tag_labels_Theorem 1'),
    VaultNote(test_vault, name='reference_with_tag_labels_Definition 1')
]
sample_outputs = predict_note_types(model, test_vault, notes_to_predict)

print(
    'The following are the raw content of the notes without '
    'metadata along with the model\'s predictions for their types:\n\n')
for note, prediction in zip(notes_to_predict, sample_outputs):
    print(process_standard_information_note(MarkdownFile.from_vault_note(note), test_vault))
    print(prediction, '\n\n')
The following are the raw content of the notes without metadata along with the model's predictions for their types:


Theorem 1. Let $R$ be a UFD. Then $R[x]$ is a UFD.

Proof. Let $f,g \in R[x]$ and suppose that $fg = 0$. Write $f = \sum_{i=0}^n a_i x^i$ and $g = \sum_{j=0}^m b_j x^j$ for some $a_i,b_j \in R$.

...

['#_meta/concept', '#_meta/proof'] 


A ring is a set with binary operators $+$ and $\cdot$ such that ...

['#_meta/definition', '#_meta/TODO/split'] 

with (mock.patch('__main__.MarkdownFile.from_vault_note') as mock_markdownfile_from_vault_note,
          mock.patch('__main__.process_standard_information_note') as mock_process_standard_information_note,
          mock.patch('__main__.load_learner') as mock_load_learner,
          mock.patch('__main__.possible_text_type_labels') as mock_possible_text_type_labels):
      mock_path, mock_vault = None, None
      mock_notes = [None, None, None, None, None]
      mock_learner = load_learner(mock_path)
      mock_possible_text_type_labels.return_value = ['#_meta/TODO/delete', '#_meta/TODO/merge', '#_meta/TODO/split', '#_meta/concept', '#_meta/conjecture', '#_meta/convention', '#_meta/definition', '#_meta/example', '#_meta/exercise', '#_meta/how_to', '#_meta/narrative', '#_meta/notation', '#_meta/proof', '#_meta/remark', 'NO_TAG']
      mock_learner.predict.side_effect = [
          (['#_meta/TODO/delete','#_meta/TODO/split','#_meta/concept','#_meta/convention'],
            tensor([ True, False,  True,  True, False,  True, False, False, False, False,
                    False, False, False, False, False]),
            tensor([1.0000e+00, 1.0748e-02, 8.5110e-01, 8.2098e-01, 3.0918e-09, 8.9845e-01,
                    6.3798e-03, 1.1441e-06, 2.3794e-07, 1.7402e-02, 3.2704e-06, 1.6747e-03,
                    2.1008e-01, 3.8969e-04, 3.7873e-07])),
          (['#_meta/narrative'],
            tensor([False, False, False, False, False, False, False, False, False, False,
                      True, False, False, False, False]),
            tensor([8.4556e-06, 3.2360e-06, 2.0235e-01, 1.1844e-03, 5.1291e-08, 2.9886e-07,
                    4.8174e-04, 8.9895e-06, 1.3379e-09, 2.8261e-11, 9.9701e-01, 5.8664e-04,
                    2.6031e-03, 1.4012e-02, 1.3595e-03])),
          (['#_meta/TODO/delete', 'NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-01, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 0.00, 9.7614e-01])),
          (['#_meta/concept', '#_meta/proof'], tensor([False, False, False,  True, False, False, False, False, False, False,
            False,  True, False, False]), tensor([4.0871e-03, 3.6683e-04, 1.6594e-01, 9.7876e-01, 0.00, 6.0281e-05, 8.7817e-06,
            1.9275e-02, 1.5589e-03, 4.5301e-03, 7.8989e-03, 1.2528e-02, 9.2800e-01, 
            9.4636e-04, 1.4658e-02])),
          (['NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-02, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 0.00, 9.7614e-01])),
      ]
      prediction = predict_note_types(mock_learner, mock_vault, mock_notes)
      correct_value = [['#_meta/TODO/delete', '#_meta/TODO/split', '#_meta/convention', '#_meta/concept'],
         ['#_meta/narrative'],
         ['#_meta/TODO/delete'],
         ['#_meta/proof', '#_meta/concept'],
         []]
      test_eq(len(prediction), len(correct_value))
      for first, second in zip(prediction, correct_value):
          try:
              test(first, second, all_equal)
          except AssertionError:
              test_shuffled(first, second)

      mock_notes = [None, None]
      mock_learner.predict.side_effect = [
          (['#_meta/TODO/delete', 'NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-01, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 0.00, 9.7614e-01])),
          (['NO_TAG'], tensor([ True, False, False, False, False, False, False, False, False, False,
            False, False, False,  True]), tensor([6.1455e-02, 1.6902e-01, 1.4254e-02, 5.1358e-02, 4.8857e-05, 3.0853e-04,
            6.0457e-03, 1.2064e-04, 8.5651e-02, 1.3941e-02, 5.2413e-04, 1.3709e-01,
            9.7726e-06, 0.00, 9.7614e-01])),
      ]

      prediction = predict_note_types(mock_learner, mock_vault, mock_notes, remove_NO_TAG=False)
      correct_value = [['NO_TAG', '#_meta/TODO/delete'],
         ['NO_TAG']]
      test_eq(len(prediction), len(correct_value))
      for first, second in zip(prediction, correct_value):
          try:
              test(first, second, all_equal)
          except AssertionError:
              test_shuffled(first, second)

automatically_add_note_type_tags

 automatically_add_note_type_tags (learners:fastai.text.learner.TextLearner|list[fastai.text.learner.TextLearner],
                                   vault:os.PathLike,
                                   notes:list[trouver.markdown.obsidian.vault.VaultNote],
                                   add_auto_label:bool=True,
                                   overwrite:Optional[str]=None)

Predict note types and add the predicted types as frontmatter YAML tags in the notes.

Non-_auto-labeled tags take precedence over auto-labeled tags, unless overwrite='w'.

Raises

  • Warning:
    • If overwrite=None, a note already has some note type tags, and learners predict note types different from those in the note.
| | Type | Default | Details |
|---|---|---|---|
| learners | fastai.text.learner.TextLearner \| list[fastai.text.learner.TextLearner] | | The ML model(s) predicting note types. |
| vault | PathLike | | The vault with the notes. |
| notes | list | | |
| add_auto_label | bool | True | If True, adds "_auto" to the front of the note type tag to indicate that the tags were added via this automated script. |
| overwrite | Optional | None | Either 'w', 'ws', 'ww', 'a', or None. If 'w' or 'ws', then overwrite any already-existing note type tags (from LABEL_TAGS), whether or not these tags are _auto tags, with the predicted tags. If 'ww', then overwrite only the _auto tags among the already-existing note type tags with the predicted tags. If 'a', then preserve already-existing note type tags and just append the newly predicted ones; if learners predict a note type whose tag is already in the note, a new tag of that type is not added, even if add_auto_label=True. If None, then do not modify a note if any note type tags already exist in it; if the predicted note types differ from the already-existing note types, then raise a warning. |
| Returns | None | | |
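
To make the role of `add_auto_label` concrete, the sketch below shows how a predicted label such as `#_meta/definition` plausibly maps to a frontmatter tag. The helper `label_to_frontmatter_tag` is hypothetical and not part of trouver; it assumes only that frontmatter tags drop the leading `#` and that auto tags are prefixed with `_auto/`, as in the examples on this page.

```python
def label_to_frontmatter_tag(label: str, add_auto_label: bool = True) -> str:
    """Hypothetical helper: turn a predicted label into a frontmatter tag."""
    # Frontmatter tags are written without the leading '#'.
    tag = label.removeprefix('#')
    # Auto tags are marked with the '_auto/' prefix.
    return f'_auto/{tag}' if add_auto_label else tag


print(label_to_frontmatter_tag('#_meta/definition'))
# _auto/_meta/definition
print(label_to_frontmatter_tag('#_meta/definition', add_auto_label=False))
# _meta/definition
```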

In the examples below, we use mock.patch to test adding note types without testing the ML model itself. In particular, we pretend as though the ML model returns certain predictions (technically, we pretend as though predict_note_types returns certain values) to construct these examples.

The following example demonstrates a basic use case of adding predicted note type tags to notes without any note type tags:

# Here we just test adding a note type without testing the ML model itself.
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    # Example where `add_auto_label` is `True`
    mock_predict_note_types.return_value = [['#_meta/definition'], ['#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_without_tag_labels_Definition 1')
    vn2 = VaultNote(temp_vault, name='reference_without_tag_labels_Theorem 1')
    notes = [vn1, vn2]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes)

    # Note that _auto/_meta/definition has been added
    print(vn1.text())
    assert (MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition'))
    assert (MarkdownFile.from_vault_note(vn2).has_tag('_auto/_meta/concept'))
    assert (MarkdownFile.from_vault_note(vn2).has_tag('_auto/_meta/proof'))


    # Example where `add_auto_label` is `False`
    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation']]
    notes = [VaultNote(temp_vault, name='reference_without_tag_labels_Definition 2')]
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, add_auto_label=False)
    assert (MarkdownFile.from_vault_note(notes[0]).has_tag('_meta/definition'))
    assert (MarkdownFile.from_vault_note(notes[0]).has_tag('_meta/notation'))
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

In the following example, overwrite is set to 'w' (or 'ws'), so all preexisting note type tags are removed before the predicted ones are added:

with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Definition 1')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='w')

    # Note that _meta/definition has been removed and _auto/_meta/definition has been added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

In the following example, overwrite is set to 'ww', so only the _auto note type tags are removed and the predicted ones are added:

# TODO: change example
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Definition 1')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='w')

    # Note that _meta/definition has been removed and _auto/_meta/definition has been added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _auto/_meta/definition]
---
# Ring[^1]

A **ring** is a set with binary operators $+$ and $\cdot$ such that ...

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Definition 1

In the following example, overwrite is set to 'a', so only newly predicted note type tags are added:

with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation', '#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='a')

    # Example with `add_auto_label=True`
    # Note that _auto/_meta/notation has been added, but _meta/definition, _meta/concept,
    # and _auto/_meta/proof remain unchanged. Moreover, _auto/_meta/definition and _auto/_meta/concept are
    # NOT added.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/notation')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/concept')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/proof')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/definition')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/concept')


with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir,
      mock.patch('__main__.predict_note_types') as mock_predict_note_types):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    mock_predict_note_types.return_value = [['#_meta/definition', '#_meta/notation', '#_meta/concept', '#_meta/proof']]
    vn1 = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    notes = [vn1]
    mock_learn = None
    automatically_add_note_type_tags(mock_learn, temp_vault, notes, overwrite='a', add_auto_label=False)

    # Example with `add_auto_label=False`
    # Note that _meta/notation has been added, and _auto/_meta/proof is replaced
    # with _meta/proof, but _meta/definition and _meta/concept remain unchanged.
    print(vn1.text())
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/notation')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/proof')
    assert not MarkdownFile.from_vault_note(vn1).has_tag('_auto/_meta/proof')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/definition')
    assert MarkdownFile.from_vault_note(vn1).has_tag('_meta/concept')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/concept, _auto/_meta/proof, _meta/definition, _auto/_meta/notation]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/concept, _meta/definition, _meta/proof, _meta/notation]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2

In the following example, overwrite is set to None. The notes are not modified, but if the note type tags in a note do not match the predicted ones, then a warning is raised.

# TODO: add example
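
Until a vault-based example is available, the described behavior for notes that already carry note type tags can be sketched in plain Python. This is a standalone illustration, not trouver's code: `warn_if_types_differ` and `NOTE_TYPE_TAGS` are hypothetical names, and the sketch assumes that existing tags are compared with predictions after stripping the `_auto/` prefix.

```python
import warnings

# A small subset of LABEL_TAGS, written without the leading '#'.
NOTE_TYPE_TAGS = {
    '_meta/concept', '_meta/definition', '_meta/notation', '_meta/proof'}


def warn_if_types_differ(
        existing_tags: list[str], predicted_labels: list[str]) -> bool:
    """Hypothetical sketch of overwrite=None: never modify, warn on mismatch."""
    # Assumption: '_auto/' tags count as the same note type as regular tags.
    existing_types = {
        tag.removeprefix('_auto/') for tag in existing_tags} & NOTE_TYPE_TAGS
    predicted_types = {label.removeprefix('#') for label in predicted_labels}
    if existing_types and existing_types != predicted_types:
        warnings.warn(
            f'Existing note types {sorted(existing_types)} differ from '
            f'predicted note types {sorted(predicted_types)}.')
        return True
    return False


existing = ['_meta/literature_note', '_meta/definition']
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    differs = warn_if_types_differ(
        existing, ['#_meta/definition', '#_meta/notation'])
print(differs, len(caught))
# True 1
```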

Convert _auto/ tags to regular tags

After checking that the automatically predicted tags are correct, we can convert them to regular tags.


convert_auto_tags_to_regular_tags_in_notes

 convert_auto_tags_to_regular_tags_in_notes
     (notes:list[trouver.markdown.obsidian.vault.VaultNote],
      exclude:list[str]=['links_added', 'notations_added'])

Convert the auto tags into regular tags for the notes.

| | Type | Default | Details |
|---|---|---|---|
| notes | list | | |
| exclude | list | ['links_added', 'notations_added'] | The tags whose _auto/ tags should not be converted. The str should not start with '#' and should not start with '_auto/'. |
| Returns | None | | |
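
The role of `exclude` can be illustrated on a plain list of tags. The function `convert_auto_tags` below is a hypothetical standalone sketch, not trouver's implementation; it assumes that converting a tag means dropping the `_auto/` prefix and that a tag already present in regular form is not duplicated.

```python
def convert_auto_tags(
        tags: list[str],
        exclude: tuple[str, ...] = ('links_added', 'notations_added')
        ) -> list[str]:
    """Hypothetical sketch: strip '_auto/' from tags unless excluded."""
    converted: list[str] = []
    for tag in tags:
        if tag.startswith('_auto/') and tag.removeprefix('_auto/') not in exclude:
            tag = tag.removeprefix('_auto/')
        # Assumption: avoid duplicating a tag that is already present.
        if tag not in converted:
            converted.append(tag)
    return converted


tags = ['_meta/literature_note', '_meta/concept', '_auto/_meta/proof',
        '_meta/definition', '_auto/links_added']
converted_tags = convert_auto_tags(tags)
print(converted_tags)
# ['_meta/literature_note', '_meta/concept', '_meta/proof', '_meta/definition', '_auto/links_added']
```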
with (tempfile.TemporaryDirectory(prefix='temp_dir', dir=os.getcwd()) as temp_dir):
    temp_vault = Path(temp_dir) / 'test_vault_6'
    shutil.copytree(_test_directory() / 'test_vault_6', temp_vault)

    vn = VaultNote(temp_vault, name='reference_with_tag_labels_Theorem 2')
    convert_auto_tags_to_regular_tags_in_notes([vn])
    print(vn.text())
    mf = MarkdownFile.from_vault_note(vn)
    assert mf.has_tag('_meta/proof')
    assert not mf.has_tag('_auto/_meta/proof')
---
cssclass: clean-embeds
aliases: []
tags: [_meta/literature_note, _meta/concept, _meta/proof, _meta/definition]
---
%%Note that this note introduces a notation and hence actually ought to be labeled with the tag _meta/notation as well; but for the sake of example, the job of adding the _meta/notation tag will be left to the `automatically_add_note_type_tags` function.%%

# The polynomial ring of a UFD is a UFD[^1]
Let $q$ be the power of a prime number. Up to isomorphism, there is a unique field with $q$ elements. This field is denoted **$\mathbb{F}_q$** and is called the **finite field of $q$ elements**.

Proof. Say that $q = p^k$ and let $F$ be a field with $q$ elements. First note that $F$ has a subfield "generated by $1$", i.e. the elements $0,1,\ldots,p-1$ form a subfield of $F$.

# See Also

# Meta
## References

## Citations and Footnotes
[^1]: Kim, Theorem 2

Footnotes

  1. Note that the tag in the YAML frontmatter meta is written as _meta/definition, without the leading hashtag #.