get graph service to work with images #112

Open

ArnavAgrawal03 wants to merge 1 commit into main

Collaborator

ArnavAgrawal03 commented Apr 24, 2025

No description provided.


          get graph service to work with images

327ae3f

cubic-dev-ai bot reviewed


cubic-dev-ai bot left a comment

mrge found 4 issues across 3 files. View them in mrge.io

core/services/document_service.py

                       folder_name: Optional[str] = None,
                       end_user_id: Optional[str] = None,
-                      use_colpali: Optional[bool] = None,
+                      retrieve_images: Optional[bool] = None,

cubic-dev-ai bot Apr 24, 2025

Parameter 'retrieve_images' creates inconsistency with similar parameter 'use_colpali' used in other methods

core/tests/unit/test_graph_service_image_extraction.py

+                  system_msg, _ = dummy.captured_messages
+                  assert system_msg['content'].startswith(expected_system_prefix)
+                  assert entities and entities[0].label == "TestEntity"
+                  assert relationships and relationships[0].type == "related_to"

cubic-dev-ai bot Apr 24, 2025

Attribute error: the test accesses 'type', but RelationshipExtraction uses the 'relationship' attribute

core/services/graph_service.py

+                      if is_image:
+                          # For images, add the image as a content block
+                          user_message_content.append({"type": "image_url", "image_url": {"url": content_limited}})

cubic-dev-ai bot Apr 24, 2025

Base64 image content is incorrectly passed as a URL instead of a data URI when constructing the message for the LLM
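
A minimal sketch of the fix this comment points at, assuming `content_limited` holds raw base64 and that the image is a PNG (the MIME type is an assumption; the diff does not show how the format is known). `to_data_uri` and `image_block` are hypothetical helper names, not part of the PR.

```python
def to_data_uri(b64_content: str, mime_type: str = "image/png") -> str:
    """Wrap raw base64 in a data URI; leave an existing data URI untouched."""
    if b64_content.startswith("data:"):
        return b64_content
    # Assumption: the caller knows the MIME type; "image/png" is only a default.
    return f"data:{mime_type};base64,{b64_content}"


def image_block(b64_content: str) -> dict:
    """Build an image content block in the same shape the diff uses."""
    return {"type": "image_url", "image_url": {"url": to_data_uri(b64_content)}}
```

With helpers like these, the appended block would become `user_message_content.append(image_block(content_limited))`.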

core/services/graph_service.py

+                          # We assume the custom prompt handles incorporating text and image content appropriately.
+                          # For simplicity, we'll just pass the original content_limited and examples_str
+                          # to the custom prompt formatter. The user is responsible for formatting in the template.
+                          formatted_user_text = custom_prompt.format(content=content_limited, examples=examples_str)

cubic-dev-ai bot Apr 24, 2025

Custom prompt handling doesn't account for multimodal image content

Adityav369 requested changes


core/services/graph_service.py

                               # Extract entities and relationships from the chunk
                               chunk_entities, chunk_relationships = await self.extract_entities_from_text(
-                                  chunk.content, chunk.document_id, chunk.chunk_number, extraction_overrides
+                                  chunk.content, chunk.document_id, chunk.chunk_number, extraction_overrides, override_is_image=chunk.metadata.get("is_image", False)

Collaborator

Adityav369 Apr 24, 2025

We should rename the method to `extract_entities`, since it no longer handles only text

core/services/graph_service.py

+                      if is_image:
+                          content_limited = content
+                      else:
+                          content_limited = content[: min(len(content), 5000)]

Collaborator

Adityav369 Apr 24, 2025

We could remove this

core/services/graph_service.py

+                      else:
+                          # For text, use standard extraction instructions
+                          system_content = (
+                              "You are an entity extraction and relationship extraction assistant. "

Collaborator

Adityav369 Apr 24, 2025

We should keep the original system prompt, please! I tested it and it works well. For images, we could just append a note saying the content is an image. That should do the job.
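
One way to read this suggestion, as a rough sketch: keep the existing system prompt untouched and append a short note only when the chunk is an image. The wording of `IMAGE_NOTE` and the helper name `build_system_content` are assumptions for illustration; the original prompt text is not shown in this thread, so it is passed in as a parameter.

```python
# Hypothetical wording; the reviewer only asks for a note that the content is an image.
IMAGE_NOTE = " The content provided to you is an image rather than text."


def build_system_content(original_prompt: str, is_image: bool) -> str:
    """Return the original system prompt, with an image note appended for image chunks."""
    if is_image:
        return original_prompt + IMAGE_NOTE
    return original_prompt
```

Text chunks then see exactly the prompt that was already tested, and only image chunks get the extra sentence.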

core/services/graph_service.py

+                          # For simplicity, we'll just pass the original content_limited and examples_str
+                          # to the custom prompt formatter. The user is responsible for formatting in the template.
+                          formatted_user_text = custom_prompt.format(content=content_limited, examples=examples_str)
+                          user_message = {"role": "user", "content": formatted_user_text}

Collaborator

Adityav369 Apr 24, 2025

Even for custom prompts, we should add the image!!!
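
A sketch of what this could look like, under a few assumptions: the custom template uses `{content}` and `{examples}` placeholders as in the diff, `content` is already a data URI when `is_image` is true, and a short placeholder string replaces the raw image data in the formatted text. `build_user_message` and `IMAGE_PLACEHOLDER` are hypothetical names.

```python
# Hypothetical text substituted for {content} so base64 is not pasted into the prompt.
IMAGE_PLACEHOLDER = "[see attached image]"


def build_user_message(custom_prompt: str, content: str, examples_str: str, is_image: bool) -> dict:
    """Format a custom prompt and attach the image as its own content block when needed."""
    if not is_image:
        text = custom_prompt.format(content=content, examples=examples_str)
        return {"role": "user", "content": text}

    # For images, format the template with a placeholder and send the image separately.
    text = custom_prompt.format(content=IMAGE_PLACEHOLDER, examples=examples_str)
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": content}},  # assumed to be a data URI
        ],
    }
```

This keeps the text path identical to the current behavior, while the image path sends both the formatted instructions and the image itself.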
