
Sujith :) ->
1. Added logs

Sujith:) 11 hours ago
parent
commit
e3fa853059

+ 4 - 4
Assets/LLM/source/Documentation/1_HighLevel_Data_Guide.md

@@ -1,4 +1,4 @@
-# LLM Guide: High-Level Project Data (`HighLevel.zip`)
+# LLM Guide: High-Level Project Data (`HighLevel/`)
 
 ## 1. Purpose
 
@@ -14,7 +14,7 @@ This data level provides a **high-level, bird's-eye view** of the entire Unity p
 
 ## 2. File Structure
 
-The zip file will contain the following structure:
+The `HighLevel` folder will contain the following structure:
 
 ```
 HighLevel/
@@ -85,11 +85,11 @@ This file is a direct copy of the project's `Packages/manifest.json`. It lists a
 
 ## 4. Handling Shrunken / Tokenized Data
 
-The JSON files in this zip may have been processed by `json_reducer.py` to save space. You can identify this if the root object contains a `key_mapper` and a `data` key.
+The JSON files in this folder may have been processed by `json_reducer.py` to save space. You can identify this if the root object contains a `key_mapper` and a `data` key.
 
 **If the data is shrunken:**
 1.  The file content will look like this: `{"key_mapper":{"k0":"productName",...},"data":{"k0":"Terra-LLM",...}}`
 2.  You **MUST** use the `key_mapper` object to de-tokenize the keys in the `data` object before you can interpret the schema. The `key_mapper` is a dictionary where the keys are the short tokens (e.g., `k0`) and the values are the original, human-readable keys (e.g., `productName`).
 3.  Recursively replace every token in the `data` object with its original value from the `key_mapper`.
 
-Once de-tokenized, the data will conform to the schemas described above.
+Once de-tokenized, the data will conform to the schemas described above.
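
The de-tokenization procedure described above is mechanical enough to sketch. A hypothetical helper, assuming the `{"key_mapper": ..., "data": ...}` layout shown in point 1 (the name `detokenize` is illustrative, not a function shipped with `json_reducer.py`):

```python
import json

def detokenize(node, key_mapper):
    """Recursively replace short tokens (e.g. 'k0') with their original keys."""
    if isinstance(node, dict):
        return {key_mapper.get(k, k): detokenize(v, key_mapper)
                for k, v in node.items()}
    if isinstance(node, list):
        return [detokenize(item, key_mapper) for item in node]
    return node  # scalars pass through unchanged

raw = json.loads('{"key_mapper":{"k0":"productName"},"data":{"k0":"Terra-LLM"}}')
plain = detokenize(raw["data"], raw["key_mapper"])
print(plain)  # {'productName': 'Terra-LLM'}
```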

+ 4 - 4
Assets/LLM/source/Documentation/2_MidLevel_Data_Guide.md

@@ -1,4 +1,4 @@
-# LLM Guide: Mid-Level Project Data (`MidLevel.zip`)
+# LLM Guide: Mid-Level Project Data (`MidLevel/`)
 
 ## 1. Purpose
 
@@ -13,7 +13,7 @@ This data level provides a **structural and relational view** of the project's a
 
 ## 2. File Structure
 
-The zip file will contain a replicated `Assets/` directory and a `GuidMappers/` directory.
+The `MidLevel` folder will contain a replicated `Assets/` directory and a `GuidMappers/` directory.
 
 ```
 MidLevel/
@@ -95,11 +95,11 @@ Each `.unity` and `.prefab` file is converted into a single JSON file that descr
 
 ## 4. Handling Shrunken / Tokenized Data
 
-The JSON files in this zip may have been processed by `json_reducer.py`. You can identify this if the root object contains a `key_mapper` and a `data` key.
+The JSON files in this folder may have been processed by `json_reducer.py`. You can identify this if the root object contains a `key_mapper` and a `data` key.
 
 **If the data is shrunken:**
 1.  The file content will look like this: `{"key_mapper":{"k0":"fileID",...},"data":[{"k0":"14854",...}]}`
 2.  You **MUST** use the `key_mapper` object to de-tokenize the keys in the `data` object before you can interpret the schema.
 3.  The `json_reducer` also removes keys with empty values (e.g., an empty `children` array `[]` will be removed). If a key like `children` is missing from a de-tokenized object, you can assume its value was empty.
 
-Once de-tokenized, the data will conform to the schemas described above.
+Once de-tokenized, the data will conform to the schemas described above.
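
Point 3 above matters for consumers of the de-tokenized data: a missing key means "was empty", not "unknown". A hypothetical one-liner (field names taken from the guide's hierarchy schema):

```python
node = {"name": "Player", "fileID": 14854}   # 'children' was stripped by json_reducer
children = node.get("children", [])          # read a missing key as empty
print(children)  # []
```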

+ 4 - 4
Assets/LLM/source/Documentation/3_LowLevel_Data_Guide.md

@@ -1,4 +1,4 @@
-# LLM Guide: Low-Level Project Data (`LowLevel.zip`)
+# LLM Guide: Low-Level Project Data (`LowLevel/`)
 
 ## 1. Purpose
 
@@ -11,7 +11,7 @@ This data level provides the **most granular, detailed, and raw view** of the pr
 
 ## 2. File Structure
 
-The zip file contains a highly detailed breakdown of the project, where scenes and prefabs are exploded into their constituent GameObjects.
+The `LowLevel` folder contains a highly detailed breakdown of the project, where scenes and prefabs are exploded into their constituent GameObjects.
 
 ```
 LowLevel/
@@ -79,11 +79,11 @@ This file contains a simple JSON list of the `fileID`s for all the root (top-lev
 
 ## 4. Handling Shrunken / Tokenized Data
 
-The JSON files in this zip may have been processed by `json_reducer.py`.
+The JSON files in this folder may have been processed by `json_reducer.py`.
 
 **If the data is shrunken:**
 1.  The file content will look like this: `{"key_mapper":{"k0":"GameObject",...},"data":{"k0":{...}}}`
 2.  You **MUST** use the `key_mapper` to de-tokenize the keys in the `data` object before you can interpret the schema.
 3.  The `json_reducer` also removes keys with empty values. If a key is missing from a de-tokenized object, you can assume its value was empty.
 
-Once de-tokenized, the data will conform to the schemas described above.
+Once de-tokenized, the data will conform to the schemas described above.

+ 5 - 5
Assets/LLM/source/Documentation/4_Query_Strategy_Guide.md

@@ -6,7 +6,7 @@ The project data is split into three levels of detail (High, Mid, Low) to enable
 
 **The workflow is always: High -> Mid -> Low.** Never start with the Low-level data unless you have a very specific `fileID` you need to look up.
 
-## 2. The Triage Process: Which Zip to Use?
+## 2. The Triage Process: Which Folder to Use?
 
 When you receive a user query, use the following decision process to select the correct data source.
 
@@ -15,7 +15,7 @@ When you receive a user query, use the following decision process to select the
 First, check if the question is about the project's global configuration, dependencies, or a general summary.
 
 - **Keywords:** "settings", "package", "version", "scenes in build", "tags", "layers", "render pipeline".
-- **Action:** Use **`HighLevel.zip`**.
+- **Action:** Use the **`HighLevel/`** folder.
 - **Example Queries:**
     - "What is the name of the game?" -> Read `manifest.json`.
     - "Which version of the Universal Render Pipeline is installed?" -> Read `packages.json`.
@@ -26,7 +26,7 @@ First, check if the question is about the project's global configuration, depend
 If the question is about the arrangement of objects within a scene/prefab, what components an object has, or where assets are, the Mid-level data is the correct source.
 
 - **Keywords:** "hierarchy", "list objects", "what components", "find all prefabs with", "children of", "location of".
-- **Action:** Use **`MidLevel.zip`**.
+- **Action:** Use the **`MidLevel/`** folder.
 - **Example Queries:**
     - "Show me the hierarchy of `SampleScene.unity`." -> Read `Assets/Scenes/SampleScene.json` and format the nested structure.
     - "What components are on the 'Player' GameObject?" -> Find the 'Player' object in the relevant scene/prefab JSON and list its `components`.
@@ -38,7 +38,7 @@ If the question is about the arrangement of objects within a scene/prefab, what
 Only proceed to this step if the user asks for a specific detail that is not present in the Mid-level hierarchy view. You should ideally already have the `fileID` of the target object from a previous Mid-level query.
 
 - **Keywords:** "exact position", "specific value", "rotation", "scale", "property of", "what is the speed".
-- **Action:** Use **`LowLevel.zip`**.
+- **Action:** Use the **`LowLevel/`** folder.
 - **Example Queries:**
     - "What are the exact local position coordinates of the 'Gun' object in the `Player` prefab?"
         1.  **Mid-Level:** First, open `Assets/CustomGame/Prefabs/Player.json`. Find the "Gun" object in the hierarchy to get its `fileID` (e.g., `101112`).
@@ -49,4 +49,4 @@ Only proceed to this step if the user asks for a specific detail that is not pre
 
 ## 4. De-Tokenization Reminder
 
-For all JSON files, **always check for tokenization first**. If the file contains a `key_mapper`, you must de-tokenize the `data` payload before attempting to process it according to the schemas and strategies outlined above.
+For all JSON files, **always check for tokenization first**. If the file contains a `key_mapper`, you must de-tokenize the `data` payload before attempting to process it according to the schemas and strategies outlined above.
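
The triage rules above condense to a keyword lookup checked in High -> Mid -> Low order. The keyword sets are copied from the guide; `pick_folder` itself is a hypothetical illustration, not part of the repository's tooling:

```python
HIGH = {"settings", "package", "version", "scenes in build", "tags", "layers", "render pipeline"}
MID = {"hierarchy", "list objects", "what components", "find all prefabs with", "children of", "location of"}
LOW = {"exact position", "specific value", "rotation", "scale", "property of", "what is the speed"}

def pick_folder(query: str) -> str:
    q = query.lower()
    for keywords, folder in ((HIGH, "HighLevel/"), (MID, "MidLevel/"), (LOW, "LowLevel/")):
        if any(k in q for k in keywords):
            return folder
    return "HighLevel/"  # default: start broad, per the High -> Mid -> Low workflow

print(pick_folder("What components are on the 'Player' GameObject?"))  # MidLevel/
```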

+ 41 - 13
Assets/LLM/source/orchestrators/extract_high_level.py

@@ -19,6 +19,7 @@ def parse_physics_settings(input_dir, project_mode):
     """
     Parses the appropriate physics settings file based on the project mode.
     """
+    print(f"    -> Analyzing {project_mode} physics settings...")
     physics_data = {}
     if project_mode == "3D":
         asset_path = input_dir / "ProjectSettings" / "DynamicsManager.asset"
@@ -32,6 +33,8 @@ def parse_physics_settings(input_dir, project_mode):
                 physics_data['layerCollisionMatrix'] = settings.get('m_LayerCollisionMatrix')
                 physics_data['autoSimulation'] = settings.get('m_AutoSimulation')
                 physics_data['autoSyncTransforms'] = settings.get('m_AutoSyncTransforms')
+        else:
+            print("       ...DynamicsManager.asset not found.")
     else: # 2D
         asset_path = input_dir / "ProjectSettings" / "Physics2DSettings.asset"
         if asset_path.is_file():
@@ -44,6 +47,8 @@ def parse_physics_settings(input_dir, project_mode):
                 physics_data['layerCollisionMatrix'] = settings.get('m_LayerCollisionMatrix')
                 physics_data['autoSimulation'] = settings.get('m_AutoSimulation')
                 physics_data['autoSyncTransforms'] = settings.get('m_AutoSyncTransforms')
+        else:
+            print("       ...Physics2DSettings.asset not found.")
     
     return physics_data
 
@@ -51,12 +56,13 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
     """
     Parses various project settings files to create a comprehensive manifest.
     """
-    print("\n--- Starting Task 2: Comprehensive Project Settings Parser ---")
-    
     manifest_data = {}
+    print("--> Generating GUID to Path map...")
     guid_map = create_guid_to_path_map(str(input_dir), ignored_folders=ignored_folders)
+    print("    ...GUID map generated.")
 
     # --- ProjectSettings.asset ---
+    print("--> Parsing ProjectSettings.asset...")
     project_settings_path = input_dir / "ProjectSettings" / "ProjectSettings.asset"
     if project_settings_path.is_file():
         docs = load_unity_yaml(str(project_settings_path))
@@ -127,16 +133,22 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
             
             manifest_data['scriptingBackend'] = final_scripting_backends
             manifest_data['apiCompatibilityLevel'] = final_api_levels
+    else:
+        print("    ...ProjectSettings.asset not found.")
 
     # --- EditorSettings.asset for 2D/3D Mode ---
+    print("--> Parsing EditorSettings.asset...")
     editor_settings_path = input_dir / "ProjectSettings" / "EditorSettings.asset"
     if editor_settings_path.is_file():
         docs = load_unity_yaml(str(editor_settings_path))
         if docs:
             editor_settings = convert_to_plain_python_types(docs[0]).get('EditorSettings', {})
             manifest_data['projectMode'] = "2D" if editor_settings.get('m_DefaultBehaviorMode') == 1 else "3D"
+    else:
+        print("    ...EditorSettings.asset not found.")
 
     # --- GraphicsSettings.asset for Render Pipeline ---
+    print("--> Parsing GraphicsSettings.asset...")
     graphics_settings_path = input_dir / "ProjectSettings" / "GraphicsSettings.asset"
     manifest_data['renderPipeline'] = 'Built-in'
     if graphics_settings_path.is_file():
@@ -153,9 +165,12 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
                     if "URP" in asset_path: manifest_data['renderPipeline'] = 'URP'
                     elif "HDRP" in asset_path: manifest_data['renderPipeline'] = 'HDRP'
                     else: manifest_data['renderPipeline'] = 'Scriptable'
+    else:
+        print("    ...GraphicsSettings.asset not found.")
 
 
     # --- TagManager.asset ---
+    print("--> Parsing TagManager.asset...")
     tag_manager_path = input_dir / "ProjectSettings" / "TagManager.asset"
     if tag_manager_path.is_file():
         docs = load_unity_yaml(str(tag_manager_path))
@@ -165,8 +180,11 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
             layers_list = tag_manager.get('layers', [])
             # Only include layers that have a name, preserving their index
             manifest_data['layers'] = {i: name for i, name in enumerate(layers_list) if name}
+    else:
+        print("    ...TagManager.asset not found.")
 
     # --- EditorBuildSettings.asset ---
+    print("--> Parsing EditorBuildSettings.asset...")
     build_settings_path = input_dir / "ProjectSettings" / "EditorBuildSettings.asset"
     if build_settings_path.is_file():
         docs = load_unity_yaml(str(build_settings_path))
@@ -176,8 +194,11 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
                 {'path': scene.get('path'), 'enabled': scene.get('enabled') == 1}
                 for scene in build_settings.get('m_Scenes', [])
             ]
+    else:
+        print("    ...EditorBuildSettings.asset not found.")
 
     # --- TimeManager.asset ---
+    print("--> Parsing TimeManager.asset...")
     time_manager_path = input_dir / "ProjectSettings" / "TimeManager.asset"
     if time_manager_path.is_file():
         docs = load_unity_yaml(str(time_manager_path))
@@ -190,15 +211,18 @@ def parse_project_settings(input_dir, output_dir, indent=None, shrink=False, ign
                 'm_TimeScale': time_manager.get('m_TimeScale'),
                 'Maximum Particle Timestep': time_manager.get('Maximum Particle Timestep')
             }
+    else:
+        print("    ...TimeManager.asset not found.")
 
     # --- Physics Settings ---
+    print("--> Parsing physics settings...")
     manifest_data['physicsSettings'] = parse_physics_settings(input_dir, manifest_data.get('projectMode', '3D'))
 
     # --- Write manifest.json ---
     manifest_output_path = output_dir / "manifest.json"
     try:
         write_json(manifest_data, manifest_output_path, indent=indent, shrink=shrink)
-        print(f"Successfully created manifest.json at {manifest_output_path}")
+        print(f"--> Successfully created manifest.json at {manifest_output_path}")
     except Exception as e:
         print(f"Error writing to {manifest_output_path}. {e}", file=sys.stderr)
 
@@ -207,18 +231,17 @@ def parse_package_manifests(input_dir, output_dir, indent=None, shrink=False):
     """
     Parses the primary package manifest and creates a clean packages.json file.
     """
-    print("\n--- Starting Task 3: Package Manifest Extractor ---")
-    
     manifest_path = input_dir / "Packages" / "manifest.json"
     
     if manifest_path.is_file():
         try:
+            print(f"--> Found package manifest at {manifest_path}")
             with open(manifest_path, 'r', encoding='utf-8') as f:
                 packages_data = json.load(f)
             
             packages_output_path = output_dir / "packages.json"
             write_json(packages_data, packages_output_path, indent=indent, shrink=shrink)
-            print(f"Successfully created packages.json at {packages_output_path}")
+            print(f"--> Successfully created packages.json at {packages_output_path}")
 
         except (IOError, json.JSONDecodeError) as e:
             print(f"Error processing {manifest_path}: {e}", file=sys.stderr)
@@ -257,17 +280,22 @@ def main():
         print(f"Error: Could not create output directory '{high_level_output_dir}'. {e}", file=sys.stderr)
         sys.exit(1)
 
+    print("\n--- Running High-Level Extraction ---")
+
+    print("\n[1/2] Parsing project settings...")
     parse_project_settings(
-        input_dir, 
-        high_level_output_dir, 
-        indent=indent_level, 
-        shrink=shrink_json, 
+        input_dir,
+        high_level_output_dir,
+        indent=indent_level,
+        shrink=shrink_json,
         ignored_folders=ignored_folders
     )
+
+    print("\n[2/2] Parsing package manifests...")
     parse_package_manifests(
-        input_dir, 
-        high_level_output_dir, 
-        indent=indent_level, 
+        input_dir,
+        high_level_output_dir,
+        indent=indent_level,
         shrink=shrink_json
     )
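
The banner prints added in this file all follow one shape: a numbered "[n/total] ..." line before a task and "--> ..." lines inside it. If the pattern keeps spreading across the orchestrators, it could be factored into a small helper; a hypothetical sketch, not code from this commit:

```python
from contextlib import contextmanager

@contextmanager
def step(index, total, label):
    """Print a numbered banner around a pipeline step."""
    print(f"\n[{index}/{total}] {label}...")
    try:
        yield
    finally:
        print(f"--> Finished: {label}.")

with step(1, 2, "Parsing project settings"):
    pass  # parse_project_settings(...) would run here
```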
 

+ 49 - 21
Assets/LLM/source/orchestrators/extract_low_level.py

@@ -11,41 +11,54 @@ sys.path.append(str(utils_path))
 from config_utils import load_config
 
 def run_subprocess(script_name, input_dir, output_dir, indent=None, shrink=False, ignored_folders=None):
-    """Helper function to run a parser subprocess."""
+    """Helper function to run a parser subprocess and stream its output."""
     script_path = Path(__file__).parent.parent / "parsers" / script_name
     command = [
         sys.executable,
         str(script_path),
-        "--input",
-        str(input_dir),
-        "--output",
-        str(output_dir)
+        "--input", str(input_dir),
+        "--output", str(output_dir)
     ]
-    
-    # Pass indent and shrink arguments to subparsers.
-    # This assumes the subparsers have been updated to accept them.
+
     if indent is not None:
         command.extend(["--indent", str(indent)])
     if shrink:
         command.append("--shrink-json")
-    
     if ignored_folders:
         command.extend(["--ignored-folders", json.dumps(ignored_folders)])
 
+    # For logging, create a display-friendly version of the command
+    display_command = ' '.join(f'"{c}"' if ' ' in c else c for c in command)
+    print(f"--> Executing: {display_command}")
+
     try:
-        result = subprocess.run(command, check=True, text=True, capture_output=True, encoding='utf-8')
-        if result.stdout:
-            print(result.stdout)
-        if result.stderr:
-            print(result.stderr, file=sys.stderr)
-
-    except subprocess.CalledProcessError as e:
-        print(f"--- ERROR in {script_name} ---")
-        print(e.stdout)
-        print(e.stderr, file=sys.stderr)
-        print(f"--- End of Error ---")
+        # Use Popen to stream output in real-time
+        process = subprocess.Popen(
+            command,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+            text=True,
+            encoding='utf-8',
+            bufsize=1  # Line-buffered
+        )
+
+        # Stream stdout
+        for line in process.stdout:
+            print(line, end='')
+
+        # Stream stderr
+        for line in process.stderr:
+            print(line, end='', file=sys.stderr)
+
+        process.wait()  # Wait for the subprocess to finish
+
+        if process.returncode != 0:
+            print(f"--- ERROR in {script_name} completed with exit code {process.returncode} ---", file=sys.stderr)
+
     except FileNotFoundError:
         print(f"--- ERROR: Script not found at {script_path} ---", file=sys.stderr)
+    except Exception as e:
+        print(f"--- An unexpected error occurred while running {script_name}: {e} ---", file=sys.stderr)
 
 
 def main():
@@ -77,12 +90,27 @@ def main():
     print(f"Output will be saved to: {low_level_output_dir}")
 
     # --- Run Extraction Pipeline ---
-    # Pass all relevant config options to the subparsers.
+    print("\n--- Running Low-Level Extraction Pipeline ---")
+    
+    print("\n[1/5] Starting: Copy scripts...")
     run_subprocess("copy_scripts.py", input_dir, low_level_output_dir, indent_level, shrink_json, ignored_folders)
+    print("--> Finished: Copy scripts.")
+    
+    print("\n[2/5] Starting: Copy shaders...")
     run_subprocess("copy_shaders.py", input_dir, low_level_output_dir, indent_level, shrink_json, ignored_folders)
+    print("--> Finished: Copy shaders.")
+    
+    print("\n[3/5] Starting: Parse project settings...")
     run_subprocess("parse_project_settings.py", input_dir, low_level_output_dir, indent_level, shrink_json, ignored_folders)
+    print("--> Finished: Parse project settings.")
+    
+    print("\n[4/5] Starting: Parse generic assets...")
     run_subprocess("parse_generic_assets.py", input_dir, low_level_output_dir, indent_level, shrink_json, ignored_folders)
+    print("--> Finished: Parse generic assets.")
+    
+    print("\n[5/5] Starting: Parse scenes and prefabs...")
     run_subprocess("parse_scenes_and_prefabs.py", input_dir, low_level_output_dir, indent_level, shrink_json, ignored_folders)
+    print("--> Finished: Parse scenes and prefabs.")
 
     print("\nLow-level extraction pipeline complete.")
 

+ 48 - 12
Assets/LLM/source/orchestrators/extract_mid_level.py

@@ -18,14 +18,14 @@ def generate_guid_mappers(input_dir, output_dir, indent=None, shrink=False, igno
     """
     Finds all .meta files and generates JSON files mapping GUIDs to asset paths.
     """
-    print("\n--- Starting GUID Mapper Generation ---")
     assets_dir = input_dir / "Assets"
     if not assets_dir.is_dir():
         print(f"Error: 'Assets' directory not found in '{input_dir}'", file=sys.stderr)
         return
 
+    print("--> Finding all .meta files...")
     meta_files = find_files_by_extension(str(assets_dir), '.meta', ignored_folders=ignored_folders, project_root=input_dir)
-    print(f"Found {len(meta_files)} .meta files to process.")
+    print(f"--> Found {len(meta_files)} .meta files to process.")
 
     asset_type_map = {
         '.prefab': 'prefabs', '.unity': 'scenes', '.mat': 'materials',
@@ -36,7 +36,12 @@ def generate_guid_mappers(input_dir, output_dir, indent=None, shrink=False, igno
     guid_maps = {value: {} for value in asset_type_map.values()}
     guid_maps['others'] = {}
 
-    for meta_file_path_str in meta_files:
+    print("--> Parsing .meta files and mapping GUIDs to asset paths...")
+    total_files = len(meta_files)
+    for i, meta_file_path_str in enumerate(meta_files):
+        if (i + 1) % 250 == 0:
+            print(f"    ...processed {i+1}/{total_files} .meta files...")
+
         meta_file_path = Path(meta_file_path_str)
         asset_file_path = Path(meta_file_path_str.rsplit('.meta', 1)[0])
 
@@ -60,17 +65,20 @@ def generate_guid_mappers(input_dir, output_dir, indent=None, shrink=False, igno
             # Use the full path from the project root for the guid_map value
             full_asset_path = input_dir / asset_file_path
             guid_maps[asset_type][guid] = full_asset_path.as_posix()
+    print(f"    ...finished processing all {total_files} .meta files.")
 
     mappers_dir = output_dir / "GuidMappers"
     try:
         mappers_dir.mkdir(parents=True, exist_ok=True)
+        print(f"--> Writing GUID mapper files to {mappers_dir}...")
         for asset_type, guid_map in guid_maps.items():
             if guid_map:
                 output_path = mappers_dir / f"{asset_type}.json"
                 # For the output JSON, we still want the project-relative path
                 relative_guid_map = {g: Path(p).relative_to(input_dir).as_posix() for g, p in guid_map.items()}
                 write_json(relative_guid_map, output_path, indent=indent, shrink=shrink)
-        print(f"Successfully created GUID mappers in {mappers_dir}")
+                print(f"    -> Created {asset_type}.json with {len(guid_map)} entries.")
+        print(f"--> Successfully created all GUID mappers.")
     except OSError as e:
         print(f"Error: Could not create GUID mapper directory or files. {e}", file=sys.stderr)
     
@@ -131,12 +139,15 @@ def main():
         print(f"Warning: 'Assets' directory not found in '{input_dir}'. Skipping all processing.", file=sys.stderr)
         return
 
+    print("\n--- Running Mid-Level Extraction ---")
+
     # --- Task 1: Replicate 'Assets' directory structure ---
-    print(f"\n--- Replicating 'Assets' directory structure ---")
+    print("\n[1/3] Replicating 'Assets' directory structure...")
     replicate_directory_structure(str(assets_dir), str(output_assets_dir), ignored_folders=ignored_folders, project_root=input_dir)
-    print("Directory structure replication complete.")
+    print("--> Directory structure replication complete.")
 
     # --- Task 2: Generate GUID Map and Mappers ---
+    print("\n[2/3] Generating GUID Mappers...")
     guid_map = generate_guid_mappers(
         input_dir, 
         mid_level_output_dir, 
@@ -144,32 +155,57 @@ def main():
         shrink=shrink_json, 
         ignored_folders=ignored_folders
     )
+    print("--> GUID Mapper generation complete.")
 
     # --- Task 3: Orchestrate Scene and Prefab Parsing for Hierarchy ---
-    print("\n--- Starting Scene/Prefab Hierarchy Parsing ---")
+    print("\n[3/3] Parsing Scene and Prefab Hierarchies...")
     
+    print("--> Finding scene and prefab files...")
     scene_files = find_files_by_extension(str(assets_dir), '.unity', ignored_folders=ignored_folders, project_root=input_dir)
     prefab_files = find_files_by_extension(str(assets_dir), '.prefab', ignored_folders=ignored_folders, project_root=input_dir)
     files_to_process = scene_files + prefab_files
     
-    print(f"Found {len(files_to_process)} scene/prefab files to process for hierarchy.")
+    print(f"--> Found {len(files_to_process)} total scene/prefab files to process.")
 
-    for file_path_str in files_to_process:
+    total_files = len(files_to_process)
+    for i, file_path_str in enumerate(files_to_process):
         file_path = Path(file_path_str)
         
         relative_path = file_path.relative_to(assets_dir)
         output_json_path = (output_assets_dir / relative_path).with_suffix('.json')
         
         try:
-            print(f"\n--- Processing Hierarchy for: {file_path.name} ---")
+            print(f"\n--- Processing {file_path.name} ({i+1}/{total_files}) ---")
             
             # Use the sophisticated processor for building the visual tree
             processor = UnitySceneProcessor(guid_map)
-            hierarchy = processor.process_file(file_path)
+            
+            print(f"    -> Loading and parsing file...")
+            if not processor.load_documents(file_path):
+                print(f"Warning: Could not load or parse {file_path.name}. Skipping.", file=sys.stderr)
+                continue
+
+            print(f"    -> Pass 1/6: Building relationship maps and creating basic nodes...")
+            processor.process_first_pass()
+
+            print(f"    -> Pass 2/6: Building hierarchy relationships...")
+            processor.process_second_pass()
+
+            print(f"    -> Pass 3/6: Verifying and fixing parent-child relationships...")
+            processor.verification_pass()
+
+            print(f"    -> Pass 4/6: Extracting components...")
+            processor.process_third_pass()
+
+            print(f"    -> Pass 5/6: Merging prefab data...")
+            processor.merge_prefab_data_pass()
+            
+            print(f"    -> Pass 6/6: Assembling final hierarchy...")
+            hierarchy = processor.get_hierarchy()
             
             output_json_path.parent.mkdir(parents=True, exist_ok=True)
             write_json(hierarchy, output_json_path, indent=indent_level, shrink=shrink_json)
-            print(f"Successfully processed hierarchy for {file_path.name} -> {output_json_path}")
+            print(f"--> Successfully processed hierarchy for {file_path.name} -> {output_json_path}")
 
         except Exception as e:
             print(f"Error processing hierarchy for {file_path.name}: {e}", file=sys.stderr)

BIN
Assets/LLM/source/parsers/__pycache__/scene_processor.cpython-313.pyc


+ 19 - 7
Assets/LLM/source/parsers/parse_scenes_and_prefabs.py

@@ -46,10 +46,13 @@ def main():
     print(f"\n--- Starting Scene/Prefab Parsing ---")
     print(f"Found {len(files_to_process)} files to process.")
 
-    for file_path_str in files_to_process:
+    total_files = len(files_to_process)
+    for i, file_path_str in enumerate(files_to_process):
         file_path = Path(file_path_str)
-        print(f"\nProcessing: {file_path.name}")
+        print(f"\n--- Processing file {i+1}/{total_files}: {file_path.relative_to(input_dir)} ---")
 
+        # --- Step 1: Deep Parsing ---
+        print("  [1/2] Starting deep parse to extract all GameObjects...")
         gameobject_list = parse_scene_or_prefab(str(file_path))
 
         relative_path = file_path.relative_to(input_dir)
@@ -57,23 +60,30 @@ def main():
         asset_output_dir.mkdir(parents=True, exist_ok=True)
 
         if gameobject_list:
-            print(f"Saving {len(gameobject_list)} GameObjects to {asset_output_dir}")
+            print(f"    -> Deep parse complete. Found {len(gameobject_list)} GameObjects.")
+            print(f"    -> Saving individual GameObject files to {asset_output_dir}...")
             for go_data in gameobject_list:
                 file_id = go_data.get('fileID')
                 if file_id:
                     output_json_path = asset_output_dir / f"{file_id}.json"
                     write_json(go_data, output_json_path, indent=args.indent, shrink=args.shrink_json)
+            print(f"    -> Finished saving GameObject files.")
         else:
-            print(f"Skipped deep parsing for {file_path.name}.")
+            print(f"    -> Skipped deep parsing for {file_path.name} (no objects found or file was empty).")
 
+        # --- Step 2: Hierarchy Parsing ---
+        print("  [2/2] Starting hierarchy parse to identify root objects...")
         try:
             documents = load_unity_yaml(file_path)
             if not documents:
+                print("    -> Skipping hierarchy parse (file is empty or could not be read).")
                 continue
 
+            print("    -> Converting YAML for hierarchy analysis...")
             raw_object_map = {int(doc.anchor.value): doc for doc in documents if hasattr(doc, 'anchor') and doc.anchor is not None}
             object_map = {file_id: convert_to_plain_python_types(obj) for file_id, obj in raw_object_map.items()}
 
+            print("    -> Analyzing transform hierarchy to find roots...")
             parser = HierarchyParser(object_map)
             root_object_ids = parser.get_root_object_ids()
             
@@ -82,12 +92,14 @@ def main():
             if root_ids_list:
                 roots_output_path = asset_output_dir / "root_objects.json"
                 write_json(root_ids_list, roots_output_path, indent=args.indent, shrink=args.shrink_json)
-                print(f"Successfully saved root object list to {roots_output_path}")
+                print(f"    -> Successfully saved {len(root_ids_list)} root object IDs to {roots_output_path}")
+            else:
+                print("    -> No root objects found.")
 
         except Exception as e:
-            print(f"Error during hierarchy parsing for {file_path.name}: {e}", file=sys.stderr)
+            print(f"    -> Error during hierarchy parsing for {file_path.name}: {e}", file=sys.stderr)
 
-    print("Scene and prefab parsing complete.")
+    print("\nScene and prefab parsing complete.")
 
 if __name__ == "__main__":
     main()

+ 22 - 13
Assets/LLM/source/parsers/scene_processor.py

@@ -431,24 +431,19 @@ class UnitySceneProcessor:
             if 'children' in node and node['children']:
                 self.cleanup_pass(node['children'])
 
-    def process_file(self, file_path):
-        """Main processing method"""
-        # Load and parse the file
+    def load_documents(self, file_path):
+        """Loads and parses the yaml documents from a scene/prefab file."""
         documents = load_unity_yaml(file_path)
         if not documents:
-            return []
-
-        # Build object map
+            self.object_map = {}
+            return False
+        
         raw_object_map = {int(doc.anchor.value): doc for doc in documents if hasattr(doc, 'anchor') and doc.anchor is not None}
         self.object_map = {file_id: convert_to_plain_python_types(obj) for file_id, obj in raw_object_map.items()}
+        return True
 
-        # Process in passes
-        self.process_first_pass()
-        self.process_second_pass()
-        self.verification_pass()
-        self.process_third_pass()
-        self.merge_prefab_data_pass()
-
+    def get_hierarchy(self):
+        """Builds and returns the final, cleaned-up hierarchy from the processed data."""
         # Use the centralized parser to get the final, sorted root objects
         parser = HierarchyParser(self.object_map)
         root_object_ids = parser.get_root_object_ids()
@@ -465,6 +460,20 @@ class UnitySceneProcessor:
 
         return root_nodes
 
+    def process_file(self, file_path):
+        """Main processing method"""
+        if not self.load_documents(file_path):
+            return []
+
+        # Process in passes
+        self.process_first_pass()
+        self.process_second_pass()
+        self.verification_pass()
+        self.process_third_pass()
+        self.merge_prefab_data_pass()
+
+        return self.get_hierarchy()
+
 
 if __name__ == "__main__":
     # This script is intended to be used as a module.
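
After this refactor, `process_file()` remains the one-shot entry point, while callers that want per-pass progress logs (as `extract_mid_level.py` now does) can drive the passes themselves. A usage sketch with placeholder inputs:

```python
from pathlib import Path
# from scene_processor import UnitySceneProcessor  # import path assumed

guid_map = {}                                   # placeholder
path = Path("Assets/Scenes/SampleScene.unity")  # placeholder

processor = UnitySceneProcessor(guid_map)

# One-shot:
hierarchy = processor.process_file(path)

# Or step-by-step, logging between passes:
if processor.load_documents(path):
    processor.process_first_pass()
    processor.process_second_pass()
    processor.verification_pass()
    processor.process_third_pass()
    processor.merge_prefab_data_pass()
    hierarchy = processor.get_hierarchy()
```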

BIN
Assets/LLM/source/utils/__pycache__/deep_parser.cpython-313.pyc


+ 10 - 0
Assets/LLM/source/utils/deep_parser.py

@@ -28,14 +28,17 @@ def parse_scene_or_prefab(file_path, guid_map=None, assets_dir=None):
     if not documents:
         return None
 
+    print("    -> Converting YAML documents to Python objects...")
     raw_object_map = {int(doc.anchor.value): doc for doc in documents if hasattr(doc, 'anchor') and doc.anchor is not None}
     object_map = {file_id: convert_to_plain_python_types(obj) for file_id, obj in raw_object_map.items()}
+    print(f"    -> Found {len(object_map)} objects in the file.")
 
     flat_gameobject_list = []
     transform_to_gameobject = {}
     parent_to_children_transforms = {}
 
     # First pass: Populate relationship maps
+    print("    -> Pass 1/2: Mapping object relationships...")
     for file_id, obj_data in object_map.items():
         if 'Transform' in obj_data:
             parent_id = obj_data['Transform'].get('m_Father', {}).get('fileID')
@@ -53,8 +56,15 @@ def parse_scene_or_prefab(file_path, guid_map=None, assets_dir=None):
                     break
 
     # Second pass: Process each GameObject
+    print("    -> Pass 2/2: Extracting GameObject data and components...")
+    total_gos = sum(1 for obj in object_map.values() if 'GameObject' in obj)
+    processed_gos = 0
     for file_id, obj_data in object_map.items():
         if 'GameObject' in obj_data:
+            processed_gos += 1
+            if processed_gos % 100 == 0:
+                print(f"        ...processed {processed_gos}/{total_gos} GameObjects...")
+
             go_info = obj_data['GameObject']
             
             source_obj_info = go_info.get('m_CorrespondingSourceObject')