Conversation
| folder_name: Optional[str] = None, | ||
| end_user_id: Optional[str] = None, | ||
| use_colpali: Optional[bool] = None, | ||
| retrieve_images: Optional[bool] = None, |
The 'retrieve_images' parameter is inconsistent with the similar 'use_colpali' parameter used in other methods.
| system_msg, _ = dummy.captured_messages | ||
| assert system_msg['content'].startswith(expected_system_prefix) | ||
| assert entities and entities[0].label == "TestEntity" | ||
| assert relationships and relationships[0].type == "related_to" |
Attribute error: the test accesses 'type', but RelationshipExtraction uses the 'relationship' attribute.
| if is_image: | ||
| # For images, add the image as a content block | ||
| user_message_content.append({"type": "image_url", "image_url": {"url": content_limited}}) |
Base64 image content is passed as a plain URL instead of a data URI when constructing the message for the LLM.
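A minimal sketch of the fix this comment points at, assuming `content_limited` holds raw base64 text and that the image is a PNG (the MIME type is not stated in the diff). The helper name `to_image_url` is hypothetical and not part of the PR.

```python
import base64


def to_image_url(content: str, mime_type: str = "image/png") -> str:
    """Return a value usable as the "url" of an image_url content block."""
    # Data URIs and remote URLs are already valid; pass them through unchanged.
    if content.startswith(("data:", "http://", "https://")):
        return content
    # Otherwise assume raw base64 and wrap it in a data URI.
    # The default MIME type is an assumption, not taken from the PR.
    return f"data:{mime_type};base64,{content}"


# Example payload standing in for a real encoded image.
raw_b64 = base64.b64encode(b"fake-image-bytes").decode("ascii")
image_block = {"type": "image_url", "image_url": {"url": to_image_url(raw_b64)}}
```

With a helper like this, the `image_url` block receives a data URI whether the chunk stores raw base64 or an already prefixed value.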
| # We assume the custom prompt handles incorporating text and image content appropriately. | ||
| # For simplicity, we'll just pass the original content_limited and examples_str | ||
| # to the custom prompt formatter. The user is responsible for formatting in the template. | ||
| formatted_user_text = custom_prompt.format(content=content_limited, examples=examples_str) |
Custom prompt handling doesn't account for multimodal image content: the image is formatted into the text template instead of being attached as an image block.
| # Extract entities and relationships from the chunk | ||
| chunk_entities, chunk_relationships = await self.extract_entities_from_text( | ||
| chunk.content, chunk.document_id, chunk.chunk_number, extraction_overrides | ||
| chunk.content, chunk.document_id, chunk.chunk_number, extraction_overrides, override_is_image=chunk.metadata.get("is_image", False) |
We should rename the method to extract_entities, since it now handles images as well as text.
| if is_image: | ||
| content_limited = content | ||
| else: | ||
| content_limited = content[: min(len(content), 5000)] |
| else: | ||
| # For text, use standard extraction instructions | ||
| system_content = ( | ||
| "You are an entity extraction and relationship extraction assistant. " |
Please keep the original system prompt! I tested it and it works well. For images, we could just append a note saying the content is an image. That should do the job.
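One way to read this suggestion, as a sketch only: leave the original system prompt untouched and append a short note when the chunk is an image. The wording of `IMAGE_NOTE` and the helper name `build_system_content` are assumptions, and only the opening sentence of the prompt visible in the diff is reproduced here.

```python
# Hypothetical wording; the reviewer only asks for a note that the content is an image.
IMAGE_NOTE = "The content provided is an image."


def build_system_content(base_prompt: str, is_image: bool) -> str:
    """Return the original system prompt, plus an image note when needed."""
    if not is_image:
        # Text chunks keep the original prompt exactly as it was.
        return base_prompt
    # Normalize trailing whitespace so the note is separated by one space.
    return base_prompt.rstrip() + " " + IMAGE_NOTE


# Opening sentence of the prompt as shown in the diff; the rest is omitted.
base = "You are an entity extraction and relationship extraction assistant. "
text_prompt = build_system_content(base, is_image=False)
image_prompt = build_system_content(base, is_image=True)
```

This keeps a single prompt to maintain and test, with the image case differing only by the appended sentence.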
| # For simplicity, we'll just pass the original content_limited and examples_str | ||
| # to the custom prompt formatter. The user is responsible for formatting in the template. | ||
| formatted_user_text = custom_prompt.format(content=content_limited, examples=examples_str) | ||
| user_message = {"role": "user", "content": formatted_user_text} |
Even for custom prompts, we should add the image!!!
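A sketch of how the custom-prompt path could attach the image, assuming `content_limited` is already a data URI and that the template uses `{content}` and `{examples}` placeholders as in the diff. The function name `build_user_message` and the placeholder text substituted for `{content}` are hypothetical.

```python
from typing import Any, Dict


def build_user_message(
    custom_prompt: str, content_limited: str, examples_str: str, is_image: bool
) -> Dict[str, Any]:
    """Build the user message for a custom prompt, attaching images as blocks."""
    if not is_image:
        # Text chunks: format the template as the diff already does.
        text = custom_prompt.format(content=content_limited, examples=examples_str)
        return {"role": "user", "content": text}
    # Image chunks: keep the base64 payload out of the text template and
    # send it as a separate image_url block next to the formatted text.
    text = custom_prompt.format(content="[see attached image]", examples=examples_str)
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": content_limited}},
        ],
    }


template = "Extract entities from: {content}\nExamples: {examples}"
msg = build_user_message(template, "data:image/png;base64,AAAA", "none", is_image=True)
```

Whether the template should see a placeholder or no `{content}` value at all is a design choice the review does not settle.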
No description provided.