Add minimal handling for broken symlinks in generate_output_content (#32)

* Add minimal handling for broken symlinks in generate_output_content
* core: simplify generate_output_content
* pylint adjust no-else-return
Bump version to 0.8.0
2025-12-06 03:22:23 -08:00 · 2025-10-28 09:27:44 +01:00 · 2025-10-25 15:33:35 +02:00 · 2025-10-25 15:11:43 +02:00 · 2025-10-25 15:02:46 +02:00 · 2025-10-25 15:02:18 +02:00
11 changed files with 2014 additions and 82 deletions
--- a/.cursor/index.mdc
+++ b/.cursor/index.mdc
@@ -1,3 +1,7 @@
 ---
 alwaysApply: true
 ---
 # repo-to-text
 ## Project Overview
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -14,7 +14,7 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
-        python-version: ["3.8", "3.11", "3.13"]
+        python-version: ["3.9", "3.11", "3.13"]
    steps:
    - uses: actions/checkout@v4
--- a/.repo-to-text-settings.yaml
+++ b/.repo-to-text-settings.yaml
@@ -18,3 +18,8 @@ ignore-content:
  - "README.md"
  - "LICENSE"
  - "tests/"
 # Optional: Maximum number of words per output file before splitting.
 # If not specified or null, no splitting based on word count will occur.
 # Must be a positive integer if set.
 # maximum_word_count_per_file: 10000
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1 @@
 .cursor/index.mdc
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1 @@
 .cursor/index.mdc
--- a/README.md
+++ b/README.md
@@ -205,6 +205,13 @@ You can copy this file from the [existing example in the project](https://github
 - **ignore-content**: Ignore files and directories only for the contents sections.
 Using these settings, you can control which files and directories are included or excluded from the final text file.
 - **maximum_word_count_per_file**: Optional integer. Sets a maximum word count for each output file. If the total content exceeds this limit, the output will be split into multiple files. The split files will be named using the convention `output_filename_part_N.txt`, where `N` is the part number.
  Example:
  ```yaml
  # Optional: Maximum word count per output file.
  # If set, the output will be split into multiple files if the total word count exceeds this.
  # maximum_word_count_per_file: 10000
  ```
 ### Wildcards and Inclusions
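Review note: the naming rule in the README entry above matches what `save_repo_to_text` does in `core.py` in this change: a single segment keeps the plain `.txt` name, and multiple segments get `_part_N` suffixes starting at 1. A minimal sketch of that rule, assuming a hypothetical helper `part_filenames` that does not exist in the repository:

```python
from typing import List


def part_filenames(stem: str, segment_count: int) -> List[str]:
    """Return output file names for a given number of segments (hypothetical helper)."""
    if segment_count <= 0:
        # Nothing to write, so no file names.
        return []
    if segment_count == 1:
        # A single segment keeps the plain name, without a part suffix.
        return [f"{stem}.txt"]
    # Multiple segments are numbered from 1.
    return [f"{stem}_part_{i + 1}.txt" for i in range(segment_count)]


names = part_filenames("repo-to-text_example", 3)
```

The stem `repo-to-text_example` is only illustrative; in the diff the stem carries a UTC timestamp.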
--- a/poetry.lock
+++ b/poetry.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,13 +4,13 @@ build-backend = "hatchling.build"
 [project]
 name = "repo-to-text"
-version = "0.6.0"
+version = "0.8.0"
 authors = [
    { name = "Kirill Markin", email = "markinkirill@gmail.com" },
 ]
 description = "Convert a directory structure and its contents into a single text file, including the tree output and file contents in structured XML format. It may be useful to chat with LLM about your code."
 readme = "README.md"
-requires-python = ">=3.6"
+requires-python = ">=3.9"
 license = { text = "MIT" }
 classifiers = [
    "Programming Language :: Python :: 3",
@@ -48,3 +48,7 @@ dev = [
 disable = [
    "C0303",
 ]
--- a/repo_to_text/cli/cli.py
+++ b/repo_to_text/cli/cli.py
@@ -39,6 +39,11 @@ def create_default_settings_file() -> None:
          - "README.md"
          - "LICENSE"
          - "package-lock.json"
        # Optional: Maximum number of words per output file before splitting.
        # If not specified or null, no splitting based on word count will occur.
        # Must be a positive integer if set.
        # maximum_word_count_per_file: 10000
    """)
    with open('.repo-to-text-settings.yaml', 'w', encoding='utf-8') as f:
        f.write(default_settings)
--- a/repo_to_text/core/core.py
+++ b/repo_to_text/core/core.py
@@ -4,11 +4,12 @@ Core functionality for repo-to-text
 import os
 import subprocess
 import platform
 from typing import Tuple, Optional, List, Dict, Any, Set
 from datetime import datetime, timezone
 from importlib.machinery import ModuleSpec
 import logging
-import yaml
+import yaml # type: ignore
 import pathspec
 from pathspec import PathSpec
@@ -36,12 +37,20 @@ def get_tree_structure(
 def run_tree_command(path: str) -> str:
    """Run the tree command and return its output."""
    if platform.system() == "Windows":
        cmd = ["cmd", "/c", "tree", "/a", "/f", path]
    else:
        cmd = ["tree", "-a", "-f", "--noreport", path]
    result = subprocess.run(
-        ['tree', '-a', '-f', '--noreport', path],
+        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        encoding='utf-8',
        check=True
    )
-    return result.stdout.decode('utf-8')
+    return result.stdout
 def filter_tree_output(
        tree_output: str,
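Review note: dropping `.decode('utf-8')` in `run_tree_command` is consistent with `text=True`, because `subprocess.run` then returns `stdout` as `str`, which has no `decode` method in Python 3. A small sketch of that behaviour, using the current Python interpreter as a stand-in for `tree` (an assumption, since `tree` may not be installed); `run_and_capture` is a hypothetical name:

```python
import subprocess
import sys
from typing import List


def run_and_capture(cmd: List[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,          # decode stdout/stderr to str
        encoding='utf-8',
        check=True,         # raise CalledProcessError on a non-zero exit code
    )
    # With text=True this is already a str, so no .decode() call is needed.
    return result.stdout


output = run_and_capture([sys.executable, "-c", "print('hello')"])
```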
@@ -74,7 +83,22 @@ def process_line(
    if not full_path or full_path == '.':
        return None
    try:
        relative_path = os.path.relpath(full_path, path).replace(os.sep, '/')
    except (ValueError, OSError) as e:
        # Handle case where relpath fails (e.g., in CI when cwd is unavailable)
        # Use absolute path conversion as fallback
        logging.debug('os.path.relpath failed for %s, using fallback: %s', full_path, e)
        if os.path.isabs(full_path) and os.path.isabs(path):
            # Both are absolute, try manual relative calculation
            try:
                common = os.path.commonpath([full_path, path])
                relative_path = os.path.relpath(full_path, common).replace(os.sep, '/')
            except (ValueError, OSError):
                # Last resort: use just the filename
                relative_path = os.path.basename(full_path)
        else:
            relative_path = os.path.basename(full_path)
    if should_ignore_file(
        full_path,
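Review note: the fallback added to `process_line` can be read as "try `os.path.relpath`, and fall back to the bare file name if it raises". A simplified sketch of that idea follows; it skips the intermediate `os.path.commonpath` attempt from the diff, and both the helper name `safe_relpath` and its injectable `relpath` parameter are assumptions made so the failure path can be exercised without removing the working directory:

```python
import os
from typing import Callable


def safe_relpath(
    full_path: str,
    base: str,
    relpath: Callable[[str, str], str] = os.path.relpath,
) -> str:
    """Return a '/'-separated relative path, or the file name if relpath fails."""
    try:
        return relpath(full_path, base).replace(os.sep, '/')
    except (ValueError, OSError):
        # Last resort, as in the diff: keep only the file name.
        return os.path.basename(full_path)


def failing_relpath(path: str, start: str) -> str:
    """Stand-in that simulates relpath failing."""
    raise ValueError("simulated relpath failure")


normal = safe_relpath(os.path.join("root", "src", "main.py"), "root")
fallback = safe_relpath(
    os.path.join("root", "src", "main.py"), "root", relpath=failing_relpath
)
```

The basename fallback loses directory information, so two files with the same name in different directories would produce the same relative path in that case.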
@@ -128,16 +152,21 @@ def load_ignore_specs(
    repo_settings_path = os.path.join(path, '.repo-to-text-settings.yaml')
    if os.path.exists(repo_settings_path):
-        logging.debug('Loading .repo-to-text-settings.yaml from path: %s', repo_settings_path)
+        logging.debug(
            'Loading .repo-to-text-settings.yaml for ignore specs from path: %s',
            repo_settings_path
        )
        with open(repo_settings_path, 'r', encoding='utf-8') as f:
            settings: Dict[str, Any] = yaml.safe_load(f)
            use_gitignore = settings.get('gitignore-import-and-ignore', True)
            if 'ignore-content' in settings:
-                content_ignore_spec: Optional[PathSpec] = pathspec.PathSpec.from_lines(
+                content_ignore_spec = pathspec.PathSpec.from_lines(
                    'gitwildmatch', settings['ignore-content']
                )
            if 'ignore-tree-and-content' in settings:
-                tree_and_content_ignore_list.extend(settings.get('ignore-tree-and-content', []))
+                tree_and_content_ignore_list.extend(
                    settings.get('ignore-tree-and-content', [])
                )
    if cli_ignore_patterns:
        tree_and_content_ignore_list.extend(cli_ignore_patterns)
@@ -154,6 +183,30 @@ def load_ignore_specs(
    )
    return gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec
 def load_additional_specs(path: str = '.') -> Dict[str, Any]:
    """Load additional specifications from the settings file."""
    additional_specs: Dict[str, Any] = {
        'maximum_word_count_per_file': None
    }
    repo_settings_path = os.path.join(path, '.repo-to-text-settings.yaml')
    if os.path.exists(repo_settings_path):
        logging.debug(
            'Loading .repo-to-text-settings.yaml for additional specs from path: %s',
            repo_settings_path
        )
        with open(repo_settings_path, 'r', encoding='utf-8') as f:
            settings: Dict[str, Any] = yaml.safe_load(f)
            if 'maximum_word_count_per_file' in settings:
                max_words = settings['maximum_word_count_per_file']
                if isinstance(max_words, int) and max_words > 0:
                    additional_specs['maximum_word_count_per_file'] = max_words
                elif max_words is not None: # Allow null/None to mean "not set"
                    logging.warning(
                        "Invalid value for 'maximum_word_count_per_file': %s. "
                        "It must be a positive integer or null. Ignoring.", max_words
                    )
    return additional_specs
 def should_ignore_file(
    file_path: str,
    relative_path: str,
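Review note: the check `isinstance(max_words, int) and max_words > 0` in `load_additional_specs` also accepts the YAML value `true`, because `bool` is a subclass of `int` in Python and `True > 0` holds. If that is not intended, a stricter validation could look like the sketch below; `parse_max_words` is a hypothetical helper, not part of this change:

```python
from typing import Any, Optional


def parse_max_words(value: Any) -> Optional[int]:
    """Return a positive integer limit, or None when the value is unset or invalid."""
    # bool is a subclass of int, so reject it before the int check.
    if isinstance(value, bool):
        return None
    if isinstance(value, int) and value > 0:
        return value
    # None, strings, floats, zero and negative numbers all mean "no limit".
    return None


accepted = parse_max_words(1000)
rejected_bool = parse_max_words(True)
```

Unlike the diff, this sketch does not log a warning for invalid values; a caller could add one where `None` is returned for a value that was not `None` to begin with.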
@@ -210,61 +263,178 @@ def save_repo_to_text(
        to_stdout: bool = False,
        cli_ignore_patterns: Optional[List[str]] = None
    ) -> str:
-    """Save repository structure and contents to a text file."""
+    """Save repository structure and contents to a text file or multiple files."""
    # pylint: disable=too-many-locals
    logging.debug('Starting to save repo structure to text for path: %s', path)
-    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(
-        path, cli_ignore_patterns
+    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = (
+        load_ignore_specs(path, cli_ignore_patterns)
    )
    additional_specs = load_additional_specs(path)
    maximum_word_count_per_file = additional_specs.get(
        'maximum_word_count_per_file'
    )
    tree_structure: str = get_tree_structure(
        path, gitignore_spec, tree_and_content_ignore_spec
    )
    logging.debug('Final tree structure to be written: %s', tree_structure)
-    output_content = generate_output_content(
+    output_content_segments = generate_output_content(
        path,
        tree_structure,
        gitignore_spec,
        content_ignore_spec,
-        tree_and_content_ignore_spec
+        tree_and_content_ignore_spec,
        maximum_word_count_per_file
    )
    if to_stdout:
-        print(output_content)
-        return output_content
+        for segment in output_content_segments:
+            print(segment, end='') # Avoid double newlines if segments naturally end with one
        # Return joined content for consistency, though primarily printed
        return "".join(output_content_segments)
-    output_file = write_output_to_file(output_content, output_dir)
-    copy_to_clipboard(output_content)
-    print(
-        "[SUCCESS] Repository structure and contents successfully saved to "
-        f"file: \"./{output_file}\""
+    timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d-%H-%M-%S-UTC')
+    base_output_name_stem = f'repo-to-text_{timestamp}'
+    output_filepaths: List[str] = []
+
+    if not output_content_segments:
        logging.warning(
            "generate_output_content returned no segments. No output file will be created."
        )
        return "" # Or handle by creating an empty placeholder file
    if len(output_content_segments) == 1:
        single_filename = f"{base_output_name_stem}.txt"
        full_path_single_file = (
            os.path.join(output_dir, single_filename) if output_dir else single_filename
        )
-    return output_file
+        if output_dir and not os.path.exists(output_dir):
            os.makedirs(output_dir)
        with open(full_path_single_file, 'w', encoding='utf-8') as f:
            f.write(output_content_segments[0])
        output_filepaths.append(full_path_single_file)
        copy_to_clipboard(output_content_segments[0])
        # Use basename for safe display in case relpath fails
        display_path = os.path.basename(full_path_single_file)
        print(
            "[SUCCESS] Repository structure and contents successfully saved to "
            f"file: \"{display_path}\""
        )
    else: # Multiple segments
        if output_dir and not os.path.exists(output_dir):
            os.makedirs(output_dir) # Create output_dir once if needed
        for i, segment_content in enumerate(output_content_segments):
            part_filename = f"{base_output_name_stem}_part_{i+1}.txt"
            full_path_part_file = (
                os.path.join(output_dir, part_filename) if output_dir else part_filename
            )
            with open(full_path_part_file, 'w', encoding='utf-8') as f:
                f.write(segment_content)
            output_filepaths.append(full_path_part_file)
        print(
            f"[SUCCESS] Repository structure and contents successfully saved to "
            f"{len(output_filepaths)} files:"
        )
        for fp in output_filepaths:
            # Use basename for safe display in case relpath fails
            display_path = os.path.basename(fp)
            print(f"  - \"{display_path}\"")
    if output_filepaths:
        # Return the actual file path for existence checks
        return output_filepaths[0]
    return ""
 def _read_file_content(file_path: str) -> str:
    """Read file content, handling binary files and broken symlinks.
    Args:
        file_path: Path to the file to read
    Returns:
        str: File content or appropriate message for special cases
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read()
    except UnicodeDecodeError:
        logging.debug('Handling binary file contents: %s', file_path)
        with open(file_path, 'rb') as f_bin:
            binary_content: bytes = f_bin.read()
        return binary_content.decode('latin1')
    except FileNotFoundError as e:
        # Minimal handling for bad symlinks
        if os.path.islink(file_path) and not os.path.exists(file_path):
            try:
                target = os.readlink(file_path)
            except OSError:
                target = ''
            return f"[symlink] -> {target}"
        raise e
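Review note: the broken-symlink branch relies on `os.path.islink` being true for the link itself while `os.path.exists` follows the link and is false when the target is missing. A self-contained sketch of that behaviour, mirroring `_read_file_content` without the binary-file branch; the name `read_text_or_symlink_note` is hypothetical:

```python
import os
import tempfile


def read_text_or_symlink_note(file_path: str) -> str:
    """Return file text, or a short note when the path is a dangling symlink."""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        # islink() inspects the link itself; exists() follows it to the target.
        if os.path.islink(file_path) and not os.path.exists(file_path):
            try:
                target = os.readlink(file_path)
            except OSError:
                target = ''
            return f"[symlink] -> {target}"
        raise


# Demo: create a dangling symlink (os.symlink may need extra privileges on Windows).
with tempfile.TemporaryDirectory() as tmp:
    link_path = os.path.join(tmp, "dangling")
    os.symlink(os.path.join(tmp, "missing.txt"), link_path)
    note = read_text_or_symlink_note(link_path)
```

Other failure modes, such as permission errors, are not covered by this sketch and would still propagate.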
 def generate_output_content(
        path: str,
        tree_structure: str,
        gitignore_spec: Optional[PathSpec],
        content_ignore_spec: Optional[PathSpec],
-        tree_and_content_ignore_spec: Optional[PathSpec]
-    ) -> str:
-    """Generate the output content for the repository."""
-    output_content: List[str] = []
+        tree_and_content_ignore_spec: Optional[PathSpec],
+        maximum_word_count_per_file: Optional[int] = None
+    ) -> List[str]:
+    """Generate the output content for the repository, potentially split into segments."""
    # pylint: disable=too-many-arguments
    # pylint: disable=too-many-locals
    # pylint: disable=too-many-positional-arguments
    output_segments: List[str] = []
    current_segment_builder: List[str] = []
    current_segment_word_count: int = 0
    project_name = os.path.basename(os.path.abspath(path))
-    # Add XML opening tag
-    output_content.append('<repo-to-text>\n')
-    output_content.append(f'Directory: {project_name}\n\n')
-    output_content.append('Directory Structure:\n')
-    output_content.append('<directory_structure>\n.\n')
+    def count_words(text: str) -> int:
+        return len(text.split())
+    def _finalize_current_segment():
+        nonlocal current_segment_word_count # Allow modification
+        if current_segment_builder:
            output_segments.append("".join(current_segment_builder))
            current_segment_builder.clear()
            current_segment_word_count = 0
    def _add_chunk_to_output(chunk: str):
        nonlocal current_segment_word_count
        chunk_wc = count_words(chunk)
        if maximum_word_count_per_file is not None:
            # If current segment is not empty, and adding this chunk would exceed limit,
            # finalize the current segment before adding this new chunk.
            if (current_segment_builder and 
                current_segment_word_count + chunk_wc > maximum_word_count_per_file):
                _finalize_current_segment()
        current_segment_builder.append(chunk)
        current_segment_word_count += chunk_wc
        # This logic ensures that if a single chunk itself is larger than the limit,
        # it forms its own segment. The next call to _add_chunk_to_output
        # or the final _finalize_current_segment will commit it.
    _add_chunk_to_output('<repo-to-text>\n')
    _add_chunk_to_output(f'Directory: {project_name}\n\n')
    _add_chunk_to_output('Directory Structure:\n')
    _add_chunk_to_output('<directory_structure>\n.\n')
    if os.path.exists(os.path.join(path, '.gitignore')):
-        output_content.append('├── .gitignore\n')
+        _add_chunk_to_output('├── .gitignore\n')
-    output_content.append(tree_structure + '\n' + '</directory_structure>\n')
+    _add_chunk_to_output(tree_structure + '\n' + '</directory_structure>\n')
-    logging.debug('Tree structure written to output content')
+    logging.debug('Tree structure added to output content segment builder')
    for root, _, files in os.walk(path):
        for filename in files:
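Review note: the `_add_chunk_to_output` / `_finalize_current_segment` pair amounts to greedy packing: chunks are appended to the current segment until the next chunk would push the word count over the limit, at which point the segment is closed. A standalone sketch of the same logic, assuming a hypothetical function `pack_chunks` that takes the chunks as a list instead of using closures:

```python
from typing import List, Optional


def pack_chunks(chunks: List[str], max_words: Optional[int]) -> List[str]:
    """Greedily pack text chunks into segments of at most max_words words."""
    segments: List[str] = []
    current: List[str] = []
    current_words = 0
    for chunk in chunks:
        chunk_words = len(chunk.split())
        # Close the current segment only if it is non-empty and would overflow.
        if (max_words is not None and current
                and current_words + chunk_words > max_words):
            segments.append("".join(current))
            current = []
            current_words = 0
        # An oversized chunk is never split; it ends up in a segment of its own.
        current.append(chunk)
        current_words += chunk_words
    if current:
        segments.append("".join(current))
    return segments


chunks = ["a b c\n", "d e\n", "f g h i j k\n", "l\n"]
split_at_five = pack_chunks(chunks, 5)
unsplit = pack_chunks(chunks, None)
```

Because each chunk is packed independently, an opening `<content ...>` tag and the file body that follows it can land in different segments when the limit falls between them.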
@@ -280,45 +450,42 @@ def generate_output_content(
            ):
                continue
-            relative_path = relative_path.replace('./', '', 1)
-            try:
-                # Try to open as text first
-                with open(file_path, 'r', encoding='utf-8') as f:
-                    file_content = f.read()
-                    output_content.append(f'\n<content full_path="{relative_path}">\n')
-                    output_content.append(file_content)
-                    output_content.append('\n</content>\n')
-            except UnicodeDecodeError:
-                # Handle binary files with the same content tag format
-                logging.debug('Handling binary file contents: %s', file_path)
-                with open(file_path, 'rb') as f:
-                    binary_content = f.read()
-                    output_content.append(f'\n<content full_path="{relative_path}">\n')
-                    output_content.append(binary_content.decode('latin1'))
-                    output_content.append('\n</content>\n')
-    # Add XML closing tag
-    output_content.append('\n</repo-to-text>\n')
-    logging.debug('Repository contents written to output content')
-    return ''.join(output_content)
+            cleaned_relative_path = relative_path.replace('./', '', 1)
+            _add_chunk_to_output(f'\n<content full_path="{cleaned_relative_path}">\n')
+            file_content = _read_file_content(file_path)
+            _add_chunk_to_output(file_content)
+            _add_chunk_to_output('\n</content>\n')
+    _add_chunk_to_output('\n</repo-to-text>\n')
+    _finalize_current_segment() # Finalize any remaining content in the builder
+    logging.debug(
+    logging.debug(
        'Repository contents generated into %s segment(s)', len(output_segments)
    )
+    # Ensure at least one segment is returned, even if it's just the empty repo structure
+    if not output_segments and not current_segment_builder:
+        # This case implies an empty repo and an extremely small word limit that split
+        # even the minimal tags. Or, if all content was filtered out.
+        # Return a minimal valid structure if everything else resulted in empty.
+        # However, the _add_chunk_to_output for repo tags should ensure
+        # current_segment_builder is not empty. And _finalize_current_segment ensures
+        # output_segments gets it. If output_segments is truly empty, it means an error
+        # or unexpected state. For safety, if it's empty, return a list with one empty
+        # string or minimal tags. Given the logic, this path is unlikely.
+        logging.warning(
+            "No output segments were generated. Returning a single empty segment."
+        )
+        return ["<repo-to-text>\n</repo-to-text>\n"]
+    return output_segments
-def write_output_to_file(output_content: str, output_dir: Optional[str]) -> str:
-    """Write the output content to a file."""
-    timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d-%H-%M-%S-UTC')
-    output_file = f'repo-to-text_{timestamp}.txt'
-    if output_dir:
-        if not os.path.exists(output_dir):
-            os.makedirs(output_dir)
-        output_file = os.path.join(output_dir, output_file)
-    with open(output_file, 'w', encoding='utf-8') as file:
-        file.write(output_content)
-    return output_file
+
 # The original write_output_to_file function is no longer needed as its logic
 # is incorporated into save_repo_to_text for handling single/multiple files.
 def copy_to_clipboard(output_content: str) -> None:
    """Copy the output content to the clipboard if possible."""
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -3,15 +3,20 @@
 import os
 import tempfile
 import shutil
-from typing import Generator
+from typing import Generator, IO
 import pytest
 from unittest.mock import patch, mock_open, MagicMock
 import yaml # For creating mock settings files easily
 from repo_to_text.core.core import (
    get_tree_structure,
    load_ignore_specs,
    should_ignore_file,
    is_ignored_path,
-    save_repo_to_text
+    save_repo_to_text,
    load_additional_specs,
    generate_output_content
 )
 # pylint: disable=redefined-outer-name
@@ -23,6 +28,47 @@ def temp_dir() -> Generator[str, None, None]:
    yield temp_path
    shutil.rmtree(temp_path)
 # Mock tree outputs
 # Raw output similar to `tree -a -f --noreport`
 MOCK_RAW_TREE_FOR_SAMPLE_REPO = """./
 ./.gitignore
 ./.repo-to-text-settings.yaml
 ./README.md
 ./src
 ./src/main.py
 ./tests
 ./tests/test_main.py
 """
 MOCK_RAW_TREE_SPECIAL_CHARS = """./
 ./special chars
 ./special chars/file with spaces.txt
 """
 MOCK_RAW_TREE_EMPTY_FILTERING = """./
 ./src
 ./src/main.py
 ./tests
 ./tests/test_main.py
 """
 # Note: ./empty_dir is removed, assuming tree or filter_tree_output would handle it.
 # This makes the test focus on the rest of the logic if tree output is as expected.
 # Expected output from get_tree_structure (filtered)
 MOCK_GTS_OUTPUT_FOR_SAMPLE_REPO = """.
 ├── .gitignore
 ├── README.md
 ├── src
 │   └── main.py
 └── tests
    └── test_main.py"""
 MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO = """.
 ├── file1.txt
 ├── file2.txt
 └── subdir
    └── file3.txt"""
 @pytest.fixture
 def sample_repo(tmp_path: str) -> str:
    """Create a sample repository structure for testing."""
@@ -60,6 +106,26 @@ ignore-content:
    return tmp_path_str
 @pytest.fixture
 def simple_word_count_repo(tmp_path: str) -> str:
    """Create a simple repository for word count testing."""
    repo_path = str(tmp_path)
    files_content = {
        "file1.txt": "This is file one. It has eight words.", # 8 words
        "file2.txt": "File two is here. This makes six words.", # 8 words per len(text.split()), despite what the text says
        "subdir/file3.txt": "Another file in a subdirectory, with ten words exactly." # 9 words per len(text.split()), despite what the text says
    }
    for file_path, content in files_content.items():
        full_path = os.path.join(repo_path, file_path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w", encoding="utf-8") as f:
            f.write(content)
    return repo_path
 def count_words_for_test(text: str) -> int:
    """Helper to count words consistently with core logic for tests."""
    return len(text.split())
 def test_is_ignored_path() -> None:
    """Test the is_ignored_path function."""
    assert is_ignored_path(".git/config") is True
@@ -111,9 +177,12 @@ def test_should_ignore_file(sample_repo: str) -> None:
        tree_and_content_ignore_spec
    ) is False
-def test_get_tree_structure(sample_repo: str) -> None:
+@patch('repo_to_text.core.core.run_tree_command', return_value=MOCK_RAW_TREE_FOR_SAMPLE_REPO)
 @patch('repo_to_text.core.core.check_tree_command', return_value=True)
 def test_get_tree_structure(mock_check_tree: MagicMock, mock_run_tree: MagicMock, sample_repo: str) -> None:
    """Test tree structure generation."""
    gitignore_spec, _, tree_and_content_ignore_spec = load_ignore_specs(sample_repo)
    # The .repo-to-text-settings.yaml in sample_repo ignores itself from tree and content
    tree_output = get_tree_structure(sample_repo, gitignore_spec, tree_and_content_ignore_spec)
    # Basic structure checks
@@ -122,8 +191,11 @@ def test_get_tree_structure(sample_repo: str) -> None:
    assert "main.py" in tree_output
    assert "test_main.py" in tree_output
    assert ".git" not in tree_output
    assert ".repo-to-text-settings.yaml" not in tree_output # Should be filtered by tree_and_content_ignore_spec
-def test_save_repo_to_text(sample_repo: str) -> None:
+@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SAMPLE_REPO)
 @patch('repo_to_text.core.core.check_tree_command', return_value=True) # In case any internal call still checks
 def test_save_repo_to_text(mock_check_tree: MagicMock, mock_get_tree: MagicMock, sample_repo: str) -> None:
    """Test the main save_repo_to_text function."""
    # Create output directory
    output_dir = os.path.join(sample_repo, "output")
@@ -137,7 +209,7 @@ def test_save_repo_to_text(sample_repo: str) -> None:
    # Test file output
    output_file = save_repo_to_text(sample_repo, output_dir=output_dir)
    assert os.path.exists(output_file)
-    assert os.path.dirname(output_file) == output_dir
+    assert os.path.abspath(os.path.dirname(output_file)) == os.path.abspath(output_dir)
    # Check file contents
    with open(output_file, 'r', encoding='utf-8') as f:
@@ -189,15 +261,20 @@ def test_load_ignore_specs_without_gitignore(temp_dir: str) -> None:
    assert content_ignore_spec is None
    assert tree_and_content_ignore_spec is not None
-def test_get_tree_structure_with_special_chars(temp_dir: str) -> None:
+@patch('repo_to_text.core.core.run_tree_command', return_value=MOCK_RAW_TREE_SPECIAL_CHARS)
 @patch('repo_to_text.core.core.check_tree_command', return_value=True)
 def test_get_tree_structure_with_special_chars(mock_check_tree: MagicMock, mock_run_tree: MagicMock, temp_dir: str) -> None:
    """Test tree structure generation with special characters in paths."""
    # Create files with special characters
-    special_dir = os.path.join(temp_dir, "special chars")
+    special_dir = os.path.join(temp_dir, "special chars") # Matches MOCK_RAW_TREE_SPECIAL_CHARS
    os.makedirs(special_dir)
    with open(os.path.join(special_dir, "file with spaces.txt"), "w", encoding='utf-8') as f:
        f.write("test")
-    tree_output = get_tree_structure(temp_dir)
+    # load_ignore_specs will be called inside; for temp_dir, they will be None or empty.
    gitignore_spec, _, tree_and_content_ignore_spec = load_ignore_specs(temp_dir)
    tree_output = get_tree_structure(temp_dir, gitignore_spec, tree_and_content_ignore_spec)
    assert "special chars" in tree_output
    assert "file with spaces.txt" in tree_output
@@ -243,7 +320,9 @@ def test_save_repo_to_text_with_binary_files(temp_dir: str) -> None:
    expected_content = f"<content full_path=\"binary.bin\">\n{binary_content.decode('latin1')}\n</content>"
    assert expected_content in output
-def test_save_repo_to_text_custom_output_dir(temp_dir: str) -> None:
+@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO) # Using simple repo tree for generic content
 @patch('repo_to_text.core.core.check_tree_command', return_value=True)
 def test_save_repo_to_text_custom_output_dir(mock_check_tree: MagicMock, mock_get_tree: MagicMock, temp_dir: str) -> None:
    """Test save_repo_to_text with custom output directory."""
    # Create a simple file structure
    with open(os.path.join(temp_dir, "test.txt"), "w", encoding='utf-8') as f:
@@ -254,8 +333,10 @@ def test_save_repo_to_text_custom_output_dir(temp_dir: str) -> None:
    output_file = save_repo_to_text(temp_dir, output_dir=output_dir)
    assert os.path.exists(output_file)
-    assert os.path.dirname(output_file) == output_dir
+    assert os.path.abspath(os.path.dirname(output_file)) == os.path.abspath(output_dir)
-    assert output_file.startswith(output_dir)
+    # output_file is relative, output_dir is absolute. This assertion needs care.
    # Let's assert that the absolute path of output_file starts with absolute output_dir
    assert os.path.abspath(output_file).startswith(os.path.abspath(output_dir))
 def test_get_tree_structure_empty_directory(temp_dir: str) -> None:
    """Test tree structure generation for empty directory."""
@@ -263,7 +344,9 @@ def test_get_tree_structure_empty_directory(temp_dir: str) -> None:
    # Should only contain the directory itself
    assert tree_output.strip() == "" or tree_output.strip() == temp_dir
-def test_empty_dirs_filtering(tmp_path: str) -> None:
+@patch('repo_to_text.core.core.run_tree_command', return_value=MOCK_RAW_TREE_EMPTY_FILTERING)
 @patch('repo_to_text.core.core.check_tree_command', return_value=True)
 def test_empty_dirs_filtering(mock_check_tree: MagicMock, mock_run_tree: MagicMock, tmp_path: str) -> None:
    """Test filtering of empty directories in tree structure generation."""
    # Create test directory structure with normalized paths
    base_path = os.path.normpath(tmp_path)
@@ -302,5 +385,364 @@ def test_empty_dirs_filtering(tmp_path: str) -> None:
        # Check that no line contains 'empty_dir'
        assert "empty_dir" not in line, f"Found empty_dir in line: {line}"
 # Tests for maximum_word_count_per_file functionality
 def test_load_additional_specs_valid_max_words(tmp_path: str) -> None:
    """Test load_additional_specs with a valid maximum_word_count_per_file."""
    settings_content = {"maximum_word_count_per_file": 1000}
    settings_file = os.path.join(tmp_path, ".repo-to-text-settings.yaml")
    with open(settings_file, "w", encoding="utf-8") as f:
        yaml.dump(settings_content, f)
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] == 1000
 def test_load_additional_specs_invalid_max_words_string(tmp_path: str, caplog: pytest.LogCaptureFixture) -> None:
    """Test load_additional_specs with an invalid string for maximum_word_count_per_file."""
    settings_content = {"maximum_word_count_per_file": "not-an-integer"}
    settings_file = os.path.join(tmp_path, ".repo-to-text-settings.yaml")
    with open(settings_file, "w", encoding="utf-8") as f:
        yaml.dump(settings_content, f)
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] is None
    assert "Invalid value for 'maximum_word_count_per_file': not-an-integer" in caplog.text
 def test_load_additional_specs_invalid_max_words_negative(tmp_path: str, caplog: pytest.LogCaptureFixture) -> None:
    """Test load_additional_specs with a negative integer for maximum_word_count_per_file."""
    settings_content = {"maximum_word_count_per_file": -100}
    settings_file = os.path.join(tmp_path, ".repo-to-text-settings.yaml")
    with open(settings_file, "w", encoding="utf-8") as f:
        yaml.dump(settings_content, f)
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] is None
    assert "Invalid value for 'maximum_word_count_per_file': -100" in caplog.text
 def test_load_additional_specs_max_words_is_none_in_yaml(tmp_path: str, caplog: pytest.LogCaptureFixture) -> None:
    """Test load_additional_specs when maximum_word_count_per_file is explicitly null in YAML."""
    settings_content = {"maximum_word_count_per_file": None} # In YAML, this is 'null'
    settings_file = os.path.join(tmp_path, ".repo-to-text-settings.yaml")
    with open(settings_file, "w", encoding="utf-8") as f:
        yaml.dump(settings_content, f)
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] is None
    assert "Invalid value for 'maximum_word_count_per_file'" not in caplog.text
 def test_load_additional_specs_max_words_not_present(tmp_path: str) -> None:
    """Test load_additional_specs when maximum_word_count_per_file is not present."""
    settings_content = {"other_setting": "value"}
    settings_file = os.path.join(tmp_path, ".repo-to-text-settings.yaml")
    with open(settings_file, "w", encoding="utf-8") as f:
        yaml.dump(settings_content, f)
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] is None
 def test_load_additional_specs_no_settings_file(tmp_path: str) -> None:
    """Test load_additional_specs when no settings file exists."""
    specs = load_additional_specs(tmp_path)
    assert specs["maximum_word_count_per_file"] is None
 # Tests for generate_output_content related to splitting
 @patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO)
 def test_generate_output_content_no_splitting_max_words_not_set(mock_get_tree: MagicMock, simple_word_count_repo: str) -> None:
    """Test generate_output_content with no splitting when max_words is not set."""
    path = simple_word_count_repo
    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(path)
    # tree_structure is now effectively MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO due to the mock
    segments = generate_output_content(
        path, MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=None
    )
    mock_get_tree.assert_not_called() # We are passing tree_structure directly
    assert len(segments) == 1
    assert "file1.txt" in segments[0]
    assert "This is file one." in segments[0]
@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO)
def test_generate_output_content_no_splitting_content_less_than_limit(mock_get_tree: MagicMock, simple_word_count_repo: str) -> None:
    """Test generate_output_content with no splitting when content is less than max_words limit."""
    path = simple_word_count_repo
    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(path)
    segments = generate_output_content(
        path, MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=500 # High limit
    )
    mock_get_tree.assert_not_called()
    assert len(segments) == 1
    assert "file1.txt" in segments[0]
@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO)
def test_generate_output_content_splitting_occurs(mock_get_tree: MagicMock, simple_word_count_repo: str) -> None:
    """Test generate_output_content when splitting occurs due to max_words limit."""
    path = simple_word_count_repo
    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(path)
    max_words = 30
    segments = generate_output_content(
        path, MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=max_words
    )
    mock_get_tree.assert_not_called()
    assert len(segments) > 1
    total_content = "".join(segments)
    assert "file1.txt" in total_content
    assert "This is file one." in total_content
    for i, segment in enumerate(segments):
        segment_word_count = count_words_for_test(segment)
        if i < len(segments) - 1:  # For all but the last segment
            # A segment may exceed max_words only if a single chunk
            # (e.g. a file content block) is itself larger than the limit.
            assert (
                segment_word_count <= max_words
                or count_words_for_test(segment.splitlines()[-2]) > max_words
            )
        else:  # The last segment may be smaller, but must not be empty
            assert segment_word_count > 0
@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO)
def test_generate_output_content_splitting_very_small_limit(mock_get_tree: MagicMock, simple_word_count_repo: str) -> None:
    """Test generate_output_content with a very small max_words limit."""
    path = simple_word_count_repo
    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(path)
    max_words = 10 # Very small limit
    segments = generate_output_content(
        path, MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=max_words
    )
    mock_get_tree.assert_not_called()
    assert len(segments) > 3 # Expect multiple splits due to small limit and multiple chunks
    total_content = "".join(segments)
    assert "file1.txt" in total_content # Check presence of file name in overall output
    raw_file1_content = "This is file one. It has eight words." # 8 words
    # Based on actual debug output, the closing tag is just "</content>" (1 word)
    closing_tag_content = "</content>" # 1 word
    # With max_words = 10:
    # The splitting logic works per chunk, so raw_content (8 words) + closing_tag (1 word) = 9 words total
    # should fit in one segment when they're placed together
    # Debug: Let's see what segments actually look like in CI
    print(f"\nDEBUG: Generated {len(segments)} segments:")
    for i, segment in enumerate(segments):
        print(f"Segment {i+1} ({count_words_for_test(segment)} words):")
        print(f"'{segment}'")
        print("---")
    found_raw_content_segment = False
    for segment in segments:
        if raw_file1_content in segment:
            # Check if this segment contains raw content with closing tag (total 9 words)
            segment_wc = count_words_for_test(segment)
            if closing_tag_content in segment:
                # Raw content (8 words) + closing tag (1 word) = 9 words total
                expected_word_count = count_words_for_test(raw_file1_content) + count_words_for_test(closing_tag_content)
                assert segment_wc == expected_word_count # Should be 9 words
                found_raw_content_segment = True
                break
            else:
                # Segment contains opening tag + raw content (2 + 8 = 10 words)
                # Opening tag: <content full_path="file1.txt"> (2 words)
                # Raw content: "This is file one. It has eight words." (8 words)
                opening_tag_word_count = 2  # <content and full_path="file1.txt">
                expected_word_count = opening_tag_word_count + count_words_for_test(raw_file1_content)
                assert segment_wc == expected_word_count # Should be 10 words
                found_raw_content_segment = True
                break
    assert found_raw_content_segment, "Segment with raw file1 content not found or not matching expected structure"
@patch('repo_to_text.core.core.get_tree_structure') # Will use a specific mock inside
def test_generate_output_content_file_header_content_together(mock_get_tree: MagicMock, tmp_path: str) -> None:
    """Test that file header and its content are not split if word count allows."""
    repo_path = str(tmp_path)
    file_content_str = "word " * 15 # 15 words
    # Tags: <content full_path="single_file.txt">\n (2) + \n</content> (1) = 3 words. Total block = 18 words.
    files_content = {"single_file.txt": file_content_str.strip()}
    for file_path_key, content_val in files_content.items():
        full_path = os.path.join(repo_path, file_path_key)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w", encoding="utf-8") as f:
            f.write(content_val)
    gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec = load_ignore_specs(repo_path)
    # Mock the tree structure for this specific test case
    mock_tree_for_single_file = ".\n└── single_file.txt"
    mock_get_tree.return_value = mock_tree_for_single_file # This mock is for any internal calls if any
    max_words_sufficient = 35 # Enough for header + this one file block (around 18 words + initial header)
    segments = generate_output_content(
        repo_path, mock_tree_for_single_file, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=max_words_sufficient
    )
    assert len(segments) == 1 # Expect no splitting of this file from its tags
    expected_file_block = f'<content full_path="single_file.txt">\n{file_content_str.strip()}\n</content>'
    assert expected_file_block in segments[0]
    # Test if it splits if max_words is too small for the file block (18 words)
    max_words_small = 10
    segments_small_limit = generate_output_content(
        repo_path, mock_tree_for_single_file, gitignore_spec, content_ignore_spec, tree_and_content_ignore_spec,
        maximum_word_count_per_file=max_words_small
    )
    # The file block (18 words) exceeds the limit, so its chunks (opening tag,
    # raw content, closing tag) cannot all share one segment; expect at least two segments.
    assert len(segments_small_limit) >= 2
    found_raw_content_in_own_segment = False
    raw_content_single_file = "word " * 15 # 15 words
    # expected_file_block is the whole thing (18 words)
    # With max_words_small = 10:
    # 1. Opening tag (2 words) -> new segment
    # 2. Raw content (15 words) -> new segment (because 0 + 15 > 10)
    # 3. Closing tag (1 word) -> new segment (1 word fits the limit, but it follows an oversized chunk)
    for segment in segments_small_limit:
        if raw_content_single_file.strip() in segment.strip() and \
           '<content full_path="single_file.txt">' not in segment and \
           '</content>' not in segment:
            # This segment should contain only the raw 15 words
            assert count_words_for_test(segment.strip()) == 15
            found_raw_content_in_own_segment = True
            break
    assert found_raw_content_in_own_segment, "Raw content of single_file.txt not found in its own segment"
# Tests for save_repo_to_text related to splitting
@patch('repo_to_text.core.core.load_additional_specs')
@patch('repo_to_text.core.core.generate_output_content')
@patch('repo_to_text.core.core.os.makedirs')
@patch('builtins.open', new_callable=mock_open)
@patch('repo_to_text.core.core.copy_to_clipboard')
def test_save_repo_to_text_no_splitting_mocked(
    mock_copy_to_clipboard: MagicMock,
    mock_file_open: MagicMock,
    mock_makedirs: MagicMock,
    mock_generate_output: MagicMock,
    mock_load_specs: MagicMock,
    simple_word_count_repo: str,
    tmp_path: str
) -> None:
    """Test save_repo_to_text: no splitting, single file output."""
    mock_load_specs.return_value = {'maximum_word_count_per_file': None}
    mock_generate_output.return_value = ["Single combined content\nfile1.txt\ncontent1"]
    output_dir = os.path.join(str(tmp_path), "output")
    with patch('repo_to_text.core.core.datetime') as mock_datetime:
        mock_datetime.now.return_value.strftime.return_value = "mock_timestamp"
        returned_path = save_repo_to_text(simple_word_count_repo, output_dir=output_dir)
    mock_load_specs.assert_called_once_with(simple_word_count_repo)
    mock_generate_output.assert_called_once()
    expected_filename = os.path.join(output_dir, "repo-to-text_mock_timestamp.txt")
    assert os.path.basename(returned_path) == os.path.basename(expected_filename)
    mock_makedirs.assert_called_once_with(output_dir)
    mock_file_open.assert_called_once_with(expected_filename, 'w', encoding='utf-8')
    mock_file_open().write.assert_called_once_with("Single combined content\nfile1.txt\ncontent1")
    mock_copy_to_clipboard.assert_called_once_with("Single combined content\nfile1.txt\ncontent1")
@patch('repo_to_text.core.core.load_additional_specs')
@patch('repo_to_text.core.core.generate_output_content')
@patch('repo_to_text.core.core.os.makedirs')
@patch('builtins.open')
@patch('repo_to_text.core.core.copy_to_clipboard')
def test_save_repo_to_text_splitting_occurs_mocked(
    mock_copy_to_clipboard: MagicMock,
    mock_open_function: MagicMock,
    mock_makedirs: MagicMock,
    mock_generate_output: MagicMock,
    mock_load_specs: MagicMock,
    simple_word_count_repo: str,
    tmp_path: str
) -> None:
    """Test save_repo_to_text: splitting occurs, multiple file outputs with better write check."""
    mock_load_specs.return_value = {'maximum_word_count_per_file': 50}
    segments_content = ["Segment 1 content data", "Segment 2 content data"]
    mock_generate_output.return_value = segments_content
    output_dir = os.path.join(str(tmp_path), "output_split_adv")
    mock_file_handle1 = MagicMock(spec=IO)
    mock_file_handle2 = MagicMock(spec=IO)
    mock_open_function.side_effect = [mock_file_handle1, mock_file_handle2]
    with patch('repo_to_text.core.core.datetime') as mock_datetime:
        mock_datetime.now.return_value.strftime.return_value = "mock_ts_split_adv"
        returned_path = save_repo_to_text(simple_word_count_repo, output_dir=output_dir)
    expected_filename_part1 = os.path.join(output_dir, "repo-to-text_mock_ts_split_adv_part_1.txt")
    expected_filename_part2 = os.path.join(output_dir, "repo-to-text_mock_ts_split_adv_part_2.txt")
    assert os.path.basename(returned_path) == os.path.basename(expected_filename_part1)
    mock_makedirs.assert_called_once_with(output_dir)
    mock_open_function.assert_any_call(expected_filename_part1, 'w', encoding='utf-8')
    mock_open_function.assert_any_call(expected_filename_part2, 'w', encoding='utf-8')
    assert mock_open_function.call_count == 2
    mock_file_handle1.__enter__().write.assert_called_once_with(segments_content[0])
    mock_file_handle2.__enter__().write.assert_called_once_with(segments_content[1])
    mock_copy_to_clipboard.assert_not_called()
@patch('repo_to_text.core.core.copy_to_clipboard')
@patch('builtins.open', new_callable=mock_open)
@patch('repo_to_text.core.core.os.makedirs')
@patch('repo_to_text.core.core.generate_output_content') # This is the one that will be used
@patch('repo_to_text.core.core.load_additional_specs')   # This is the one that will be used
@patch('repo_to_text.core.core.get_tree_structure', return_value=MOCK_GTS_OUTPUT_FOR_SIMPLE_REPO)
def test_save_repo_to_text_stdout_with_splitting(
    mock_get_tree: MagicMock,         # Order of mock args should match decorator order (bottom-up)
    mock_load_specs: MagicMock,
    mock_generate_output: MagicMock,
    mock_os_makedirs: MagicMock,
    mock_file_open: MagicMock,
    mock_copy_to_clipboard: MagicMock,
    simple_word_count_repo: str,
    capsys: pytest.CaptureFixture[str]
) -> None:
    """Test save_repo_to_text with to_stdout=True and content that would split."""
    mock_load_specs.return_value = {'maximum_word_count_per_file': 10}
    mock_generate_output.return_value = ["Segment 1 for stdout.", "Segment 2 for stdout."]
    result_string = save_repo_to_text(simple_word_count_repo, to_stdout=True)
    mock_load_specs.assert_called_once_with(simple_word_count_repo)
    mock_get_tree.assert_called_once() # Assert that get_tree_structure was called
    mock_generate_output.assert_called_once()
    mock_os_makedirs.assert_not_called()
    mock_file_open.assert_not_called()
    mock_copy_to_clipboard.assert_not_called()
    captured = capsys.readouterr()
    assert "Segment 1 for stdout.Segment 2 for stdout." == captured.out.strip() # Added strip() to handle potential newlines from logging
    assert result_string == "Segment 1 for stdout.Segment 2 for stdout."
@patch('repo_to_text.core.core.load_additional_specs')
@patch('repo_to_text.core.core.generate_output_content')
@patch('repo_to_text.core.core.os.makedirs')
@patch('builtins.open', new_callable=mock_open)
@patch('repo_to_text.core.core.copy_to_clipboard')
def test_save_repo_to_text_empty_segments(
    mock_copy_to_clipboard: MagicMock,
    mock_file_open: MagicMock,
    mock_makedirs: MagicMock,
    mock_generate_output: MagicMock,
    mock_load_specs: MagicMock,
    simple_word_count_repo: str,
    tmp_path: str,
    caplog: pytest.LogCaptureFixture
) -> None:
    """Test save_repo_to_text when generate_output_content returns no segments."""
    mock_load_specs.return_value = {'maximum_word_count_per_file': None}
    mock_generate_output.return_value = []
    output_dir = os.path.join(str(tmp_path), "output_empty")
    returned_path = save_repo_to_text(simple_word_count_repo, output_dir=output_dir)
    assert returned_path == ""
    mock_makedirs.assert_not_called()
    mock_file_open.assert_not_called()
    mock_copy_to_clipboard.assert_not_called()
    assert "generate_output_content returned no segments" in caplog.text
if __name__ == "__main__":
    pytest.main([__file__])
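
The splitting behaviour these tests exercise can be sketched as a greedy pass over pre-built chunks. This is only an illustration, not the project's `generate_output_content`: the function name `split_chunks_by_word_count`, the explicit chunk list, and whitespace-based word counting are all assumptions made for the example.

```python
from typing import List, Optional


def split_chunks_by_word_count(
    chunks: List[str], max_words: Optional[int]
) -> List[str]:
    """Greedily pack chunks into segments of at most max_words words.

    Hypothetical helper: a chunk is never cut in the middle, so a single
    chunk larger than max_words ends up alone in an oversized segment.
    """
    if not chunks:
        return []
    if max_words is None:
        # No limit configured: everything goes into one segment.
        return ["".join(chunks)]

    segments: List[str] = []
    current: List[str] = []
    current_words = 0
    for chunk in chunks:
        chunk_words = len(chunk.split())  # assumption: words are whitespace-separated
        # Close the current segment if adding this chunk would exceed the limit.
        if current and current_words + chunk_words > max_words:
            segments.append("".join(current))
            current, current_words = [], 0
        current.append(chunk)
        current_words += chunk_words
    if current:
        segments.append("".join(current))
    return segments


if __name__ == "__main__":
    demo = [
        '<content full_path="file1.txt">\n',
        "This is file one. It has eight words.",
        "\n</content>",
    ]
    for part in split_chunks_by_word_count(demo, 10):
        print(len(part.split()), repr(part))
```

Under these assumptions, a limit of 10 keeps the opening tag (2 words) and the 8 words of content in one segment, while a 15-word chunk that exceeds the limit ends up in a segment of its own, which is the structure the assertions above look for.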
Author	SHA1	Message	Date
Luke Craig	77209f30aa	Add minimal handling for broken symlinks in generate_output_content (#32) * Add minimal handling for broken symlinks in generate_output_content * core: simplify generate_output_content * pylint adjust no-else-return	2025-10-28 09:27:44 +01:00
Kirill Markin	8a94182b3d	Bump version to 0.8.0	2025-10-25 15:33:35 +02:00
Kirill Markin	bcb0d82191	refactor: reorganize cursor rules into .cursor directory - Move cursor rules from .cursorrules to .cursor/index.mdc - Create CLAUDE.md and AGENTS.md symlinks in project root - Delete deprecated .cursorrules file - Symlinks point to .cursor/index.mdc for consistent rule management	2025-10-25 15:11:43 +02:00
Kirill Markin	2807344752	Merge pull request #35 from kirill-markin/fix-issue-26-windows-tree-command Fix tree command for Windows	2025-10-25 15:02:46 +02:00
Kirill Markin	3721ed45f0	Fix tree command for Windows (fixes #26) - Add platform detection to run_tree_command - Use 'cmd /c tree /a /f' syntax on Windows - Keep 'tree -a -f --noreport' syntax on Unix/Linux/Mac - Modernize subprocess call with text=True and encoding='utf-8' - Add stderr=subprocess.PIPE for better error handling All 43 tests pass successfully.	2025-10-25 15:02:18 +02:00
Kirill Markin	de1c84eca3	Fix test assertion for content splitting logic - Corrected test_generate_output_content_splitting_very_small_limit to expect 10 words instead of 8 - The test now properly accounts for opening tag (2 words) + raw content (8 words) in the same segment - Reflects actual behavior where opening tag and content are grouped together when they fit within word limit	2025-05-25 11:20:34 +03:00
Kirill Markin	0ace858645	Add debug output to understand CI test failure	2025-05-25 11:14:12 +03:00
Kirill Markin	44153cde98	Fix failing test: test_generate_output_content_splitting_very_small_limit - Corrected word count expectations for closing XML tag - Fixed test logic to match actual output segment structure - The closing tag '</content>' is 1 word, not 2 as previously assumed - All 43 tests now pass successfully	2025-05-25 11:12:48 +03:00
Kirill Markin	b04dd8df63	Fix pylint logging-fstring-interpolation warning - Replace f-string with lazy % formatting in logging.debug() call - Resolves W1203 pylint warning for better logging performance - Achieves 10.00/10 pylint rating	2025-05-25 11:07:43 +03:00
Kirill Markin	14d2b3b36e	Fix GitHub Actions tests - Remove Python 3.8 from test matrix (incompatible with requires-python >=3.9) - Add proper type annotations for pytest fixtures (capsys, caplog)	2025-05-25 11:05:22 +03:00
Kirill Markin	689dd362ec	Update test functions to include explicit type annotations for caplog - Modify test_load_additional_specs_invalid_max_words_string, test_load_additional_specs_invalid_max_words_negative, and test_load_additional_specs_max_words_is_none_in_yaml to specify caplog as pytest.LogCaptureFixture. - Update test_save_repo_to_text_stdout_with_splitting and test_save_repo_to_text_empty_segments to annotate capsys and caplog respectively for improved type safety and clarity.	2025-05-25 11:03:20 +03:00
Kirill Markin	57026bd52e	Enhance error handling in process_line and update display path in save_repo_to_text - Add fallback logic for os.path.relpath in process_line to handle cases where it fails, ensuring robust path resolution. - Update save_repo_to_text to use basename for displaying file paths, improving clarity in success messages and output. - Modify tests to assert on basename instead of relative path, aligning with the new display logic.	2025-05-25 11:02:06 +03:00
Kirill Markin	241ce0ef70	Fix CI: Enable dev dependencies for pylint - Uncomment [project.optional-dependencies] dev section - Remove duplicate Poetry dev dependencies - Fix pylint command not found error in GitHub Actions - Resolves CI failure in PR #28	2025-05-25 10:53:11 +03:00
Kirill Markin	d7badce9ae	Merge pull request #28 from kirill-markin/feature/word-count-splitting Add support for splitting text by maximum word count	2025-05-25 10:49:35 +03:00
Kirill Markin	3731c01a20	Refactor logging statements in core.py for improved readability - Split long logging messages into multiple lines for better clarity - Ensure consistent formatting across logging calls - Minor adjustments to maintain code readability	2025-05-25 10:48:38 +03:00
Kirill Markin	7a60741471	Remove unused IO import from core.py	2025-05-25 10:48:37 +03:00
Kirill Markin	5c5b0ab941	Bump version to 0.7.0 for word count splitting feature	2025-05-25 10:48:36 +03:00
Zhan Li	34aa48c0a1	address test errors	2025-05-25 00:33:35 -07:00
Zhan Li	e066b481af	add support for splitted text by maximum word count	2025-05-25 00:11:54 -07:00
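
The platform detection described in the "Fix tree command for Windows" commit could look roughly like the sketch below. The flags and the `subprocess` options are taken from the commit message; the function names `build_tree_command` and `run_tree`, the use of `platform.system()`, the position of the path argument, and `check=True` are assumptions for illustration and may differ from the project's `run_tree_command`.

```python
import platform
import subprocess
from typing import List, Optional


def build_tree_command(path: str, system: Optional[str] = None) -> List[str]:
    """Return the argv for listing `path` with the platform's tree command.

    Hypothetical helper: `system` defaults to platform.system(), which
    returns "Windows" on Windows.
    """
    system = system or platform.system()
    if system == "Windows":
        # Windows syntax named in the commit message: cmd /c tree /a /f
        return ["cmd", "/c", "tree", "/a", "/f", path]
    # Unix/Linux/Mac syntax named in the commit message.
    return ["tree", "-a", "-f", "--noreport", path]


def run_tree(path: str) -> str:
    """Run the tree command and return its output as text."""
    result = subprocess.run(
        build_tree_command(path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # captured so errors can be reported
        text=True,
        encoding="utf-8",
        check=True,  # assumption: raise CalledProcessError on failure
    )
    return result.stdout
```

Keeping the command construction separate from the `subprocess.run` call makes the platform branch testable without having `tree` installed.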