fast_prod_new.py – Full Code Walkthrough
1. What the Service Does (Overview)
This is a FastAPI service that uses the InternVL v2.5 1B vision model to verify reverse pickups:
- Analyzes both vendor images (product listing) and pickup images (customer return)
- Compares the two and reports: does the product match, is there damage, are quantity/size/color correct
- Responses come back as structured JSON (image_analysis, verification_results, pickup_confidence)
- Caching is disabled – every request is processed fresh
- The response is sent immediately after inference; the DB save happens in the background
2. Imports & Setup
- sql_conn – DB inserts (e.g. conn.insert_single_dict_simple, conn.create_table_from_dict)
- asyncio, aiohttp – async image downloads, prefetch workers
- FastAPI, Pydantic – API, request/response models, validation
- torch, transformers – model loading, inference
- PIL, torchvision – image resize, normalize, tensor creation
- jsonschema, fuzzywuzzy, pyaml, ast – response validation, JSON/YAML parsing, string matching
3. Error Codes (Lines 49–60)
The ErrorCodes class uses the same numeric code 1099 everywhere (general/validation/image/model/timeout/etc.). The error types differ; the numeric code does not.
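A minimal sketch of what such a class likely looks like (the attribute names are hypothetical; the source only confirms that every error type reuses code 1099):

```python
class ErrorCodes:
    """Every error category shares the numeric code 1099.

    Attribute names here are illustrative guesses based on the types
    listed in the walkthrough (general/validation/image/model/timeout).
    """
    GENERAL_ERROR = 1099
    VALIDATION_ERROR = 1099
    IMAGE_ERROR = 1099
    MODEL_ERROR = 1099
    TIMEOUT_ERROR = 1099
```

In practice this means callers can only distinguish errors by type name or message, never by code.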
4. Config (Lines 63–96)
- MODEL_PATH – "full_14_08_2025_ch_28k" (InternVL model folder)
- HOST/PORT – 0.0.0.0:8001 (overridable via env)
- MAX_IMAGE_SIZE – 10MB
- REQUEST_TIMEOUT – 300s
- MAX_CONCURRENT_REQUESTS – 10 (how many requests can run inference concurrently)
- IMAGE_SIZE – 448 (preprocess size)
- PICKUP_PATCH – 2 (patch count for pickup images)
- generation_config – max_new_tokens 2000, do_sample False
- MAX_IMAGES_PER_REQUEST – 20
- THREAD_POOL_SIZE – 16
- PREFETCH_FACTOR / PREFETCH_WORKERS / PIPELINE_QUEUE_SIZE – 1, 2, 5 (for the prefetch pipeline)
- Caching-related config has been removed
5. JSON Schema (Lines 99–199)
The exact response structure is defined:
- image_analysis – pickup_images, vendor_image, comparison (strings)
- verification_results – array of objects: Type, Label, IsMandatory, ValueToCheck, Result, Evidence, Confidence, isRetakeImage
- pickup_confidence – overall_score, confidence_factors (image_quality, attribute_visibility, matching_confidence), recommendation (Accept/Reject/Partial/Retry)
validate_json_schema() validates the model output against this very schema; on failure, "Fail inference" is returned.
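The real service validates against the full schema with the jsonschema library; as a stdlib-only illustration of the same top-level contract, a sketch might look like:

```python
# Stdlib approximation of validate_json_schema(); the actual code uses the
# jsonschema library against the complete schema described above.
REQUIRED_TOP_LEVEL = ("image_analysis", "verification_results", "pickup_confidence")

def validate_json_schema(data):
    """Return (is_valid, error_message) for the model's parsed output."""
    if not isinstance(data, dict):
        return False, "output is not a JSON object"
    for key in REQUIRED_TOP_LEVEL:
        if key not in data:
            return False, f"missing required key: {key}"
    if not isinstance(data["verification_results"], list):
        return False, "verification_results must be an array"
    return True, ""
```

On a False result the service retries inference once with a stricter prompt before giving up.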
6. Logging (200–209)
- Level = from Config (default INFO)
- Logs go to both the console and internvl_api.log
7. Validation & Exception (211–239)
- validate_json_schema(data) – validates via jsonschema, returns True/False plus an error message
- ProcessingError – custom exception: message, error_code, status_code, request_uuid, processing_time
- create_error_response() – standard error dict: errorMsg, code, processingTime, previousuuid
8. Prefetching – PrefetchedData (243–254)
PrefetchedData dataclass:
- request_id
- vendor_images, pickup_images (PIL)
- vendor_tensors, pickup_tensors (model input)
- metadata, timestamp, processing_time
This is what a prefetch worker prepares and puts into processed_queue.
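A sketch of the dataclass matching the field list above (the type hints are illustrative, since the real fields hold PIL images and torch tensors):

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class PrefetchedData:
    """Preprocessed request payload produced by a prefetch worker."""
    request_id: str
    vendor_images: List[Any] = field(default_factory=list)   # PIL images
    pickup_images: List[Any] = field(default_factory=list)   # PIL images
    vendor_tensors: List[Any] = field(default_factory=list)  # model inputs
    pickup_tensors: List[Any] = field(default_factory=list)  # model inputs
    metadata: Dict[str, Any] = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)
    processing_time: float = 0.0
```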
9. ModelState (258–328)
Single state object for the model + prefetch pipeline:
- model, tokenizer, is_loaded, load_error
- Semaphore – max 10 concurrent requests
- Thread pools – general (16 workers), prefetch (2 workers)
- aiohttp session – image download (get_session)
- prefetch_queue – incoming requests (max 5)
- processed_queue – preprocessed PrefetchedData (max 5)
- prefetch_tasks – background worker tasks
Methods:
- get_semaphore() – concurrency limit
- get_thread_pool() / get_prefetch_pool() – thread pools
- start_prefetch_pipeline() – starts PREFETCH_FACTOR (1) workers
- _prefetch_worker(worker_id) – takes a request from the queue, calls _preprocess_request, puts the result into processed_queue
- _preprocess_request(request_data) – async-downloads the vendor URLs, decodes pickup base64/URLs, turns both into tensors via load_image, returns PrefetchedData
- submit_for_prefetch(request, request_id) – puts the request into prefetch_queue (non-blocking; skipped if the queue is full)
- get_prefetched_data(timeout) – takes data from processed_queue (short timeout)
No caching at all – just prefetch + queues; nothing is stored in a cache.
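The worker/queue interaction described above follows the standard bounded producer-consumer pattern. A minimal asyncio sketch, where preprocess() is a stand-in for _preprocess_request:

```python
import asyncio

async def preprocess(request):
    # Placeholder for the real work: image downloads + tensor preparation
    await asyncio.sleep(0)
    return {"request_id": request["id"], "tensors": "..."}

async def prefetch_worker(in_q, out_q):
    # Mirrors _prefetch_worker: pull, preprocess, push to processed queue
    while True:
        request = await in_q.get()
        try:
            out_q.put_nowait(await preprocess(request))
        except asyncio.QueueFull:
            pass  # drop the result if consumers are lagging
        finally:
            in_q.task_done()

async def demo():
    in_q = asyncio.Queue(maxsize=5)    # prefetch_queue equivalent
    out_q = asyncio.Queue(maxsize=5)   # processed_queue equivalent
    worker = asyncio.create_task(prefetch_worker(in_q, out_q))
    in_q.put_nowait({"id": "req-1"})   # submit_for_prefetch equivalent
    # get_prefetched_data equivalent: wait with a short timeout
    data = await asyncio.wait_for(out_q.get(), timeout=1.0)
    worker.cancel()
    return data
```

The bounded maxsize on both queues is what makes submit_for_prefetch non-blocking: when the input queue is full, the submission is simply skipped.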
10. Request/Response Models (330–608)
RequestItem (Pydantic):
- pickupproductimage – 1–10 URLs (vendor/reference images)
- skupickdone – 1–10 items (base64 or https URL – pickup images)
- productname, question, shippingid – required
- productid, uuid, previousuuid – optional
- Validators: URLs must be http/https, base64/S3 must be valid, productname/question/shippingid non-empty with length checks; if uuid is absent, one is auto-generated
ProcessingResponse:
- shippingid, uuid, product_id, processing_time_ai, Question (list), ai_decision, finalAction, ai_match_score, analysis, status, StatusCode, total_time, timestamp
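The validator logic can be illustrated with plain stdlib helpers (the real code implements these as Pydantic field validators on RequestItem; the function names here are hypothetical):

```python
import base64
import binascii
import uuid
from urllib.parse import urlparse

def validate_url(value: str) -> str:
    # pickupproductimage entries must be http/https URLs
    if urlparse(value).scheme not in ("http", "https"):
        raise ValueError(f"URL must be http/https: {value}")
    return value

def validate_pickup_item(value: str) -> str:
    # skupickdone items may be https URLs or base64-encoded images
    if value.startswith("https://"):
        return validate_url(value)
    try:
        base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError("not a valid base64 string") from exc
    return value

def ensure_uuid(value):
    # auto-generate a UUID when the request did not supply one
    return value or str(uuid.uuid4())
```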
11. Image Processing (612–693)
- IMAGENET_MEAN / STD – normalization
- build_transform() – RGB, resize (input_size x input_size), ToTensor, Normalize
- find_closest_aspect_ratio() – matches the image's aspect ratio against predefined ratios and picks the best (width, height)
- dynamic_preprocess() – splits the image into patches (min_num, max_num, image_size) with a thumbnail option; crops according to the model's expected grid
- load_image(image_file, input_size, max_num) – transform + dynamic_preprocess yield a list of tensors, then torch.stack
- rewrite_s3_to_cloudfront() – replaces xbeestest.s3.amazonaws.com with the CloudFront domain
Vendor images use max_num=1; pickup images use max_num=config.PICKUP_PATCH (2).
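The aspect-ratio matching step can be sketched in pure Python (simplified from the usual InternVL recipe; the real version also breaks ties using the image area):

```python
def find_closest_aspect_ratio(width, height, min_num=1, max_num=2):
    """Pick the predefined (w, h) patch grid whose ratio best matches the image."""
    aspect_ratio = width / height
    # Enumerate all grids (w, h) with min_num <= w*h <= max_num
    target_ratios = sorted(
        {(w, h)
         for n in range(min_num, max_num + 1)
         for w in range(1, n + 1)
         for h in range(1, n + 1)
         if w * h == n},
        key=lambda r: r[0] * r[1],
    )
    # Choose the grid whose w/h ratio is closest to the image's ratio
    return min(target_ratios, key=lambda r: abs(aspect_ratio - r[0] / r[1]))
```

With max_num=2 (the PICKUP_PATCH setting), the candidate grids are (1,1), (1,2), and (2,1), so a wide pickup photo gets split into two side-by-side patches.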
12. Async Image Download & Decode (598–733)
- download_image_async(session, url, uuid) – GET via aiohttp, size check (MAX_IMAGE_SIZE), PIL RGB; raises ProcessingError on failure
- decode_base64_image_batch(b64_strings, uuid) – per item: if it's an https URL, CloudFront rewrite + requests.get (max 2 retries), otherwise base64.b64decode; size check; PIL RGB; any failure raises ProcessingError
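A stdlib sketch of the rewrite-plus-decode path. The CloudFront target domain here is a placeholder, since the source names only the S3 host; the actual code additionally downloads URLs with requests and opens the bytes with PIL as RGB:

```python
import base64

MAX_IMAGE_SIZE = 10 * 1024 * 1024  # 10MB, per the config section

def rewrite_s3_to_cloudfront(url: str) -> str:
    # "dxxxx.cloudfront.net" is a hypothetical stand-in for the real domain
    return url.replace("xbeestest.s3.amazonaws.com", "dxxxx.cloudfront.net")

def decode_base64_image_batch(items):
    """Classify each item as a URL (rewritten) or decoded base64 bytes.

    The real function then fetches URLs (max 2 retries) and opens every
    byte payload with PIL; both of those steps are omitted here.
    """
    results = []
    for item in items:
        if item.startswith("https://"):
            results.append(("url", rewrite_s3_to_cloudfront(item)))
        else:
            raw = base64.b64decode(item)
            if len(raw) > MAX_IMAGE_SIZE:
                raise ValueError("image exceeds MAX_IMAGE_SIZE")
            results.append(("bytes", raw))
    return results
```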
13. Data Processing – Question & JSON (836–893)
- process_verification_data(json_string) – parses request.question and extracts the Question list; tries json.loads (double/single quotes) then ast.literal_eval; raises ProcessingError on failure
- number_to_word(total) – converts the numbers 1–59 into words ("one", "two", … "fifty-nine"), used in the prompt ("one image", "two images")
- extract_json_regex_fields() – pulls image_analysis, verification_results, pickup_confidence out of a string via regex into a dict
- extract_json(input_string) – first pyaml.safe_load, then json.loads, then a regex to find {...} and try json/ast/pyaml/regex in turn; used to extract JSON from the model's text response
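The fallback chain in extract_json() can be sketched with stdlib parsers only (the pyaml.safe_load step is omitted here because it is third-party):

```python
import ast
import json
import re

def extract_json(text: str):
    """Try progressively looser parsers to recover a dict from model text."""
    # 1) Direct parse: strict JSON, then Python-literal syntax (single quotes)
    for parser in (json.loads, ast.literal_eval):
        try:
            return parser(text)
        except (ValueError, SyntaxError):
            pass
    # 2) Regex rescue: grab the outermost {...} span and retry both parsers
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        snippet = match.group(0)
        for parser in (json.loads, ast.literal_eval):
            try:
                return parser(snippet)
            except (ValueError, SyntaxError):
                pass
    return None
```

The ast.literal_eval pass is what tolerates model output like {'key': 'value'}, which strict JSON rejects.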
14. Prompt & Verification Fixing (895–1170)
- prompt_filter – categorizes question types by keywords: category_verification, design_matching, quantity_verification, brand_verification, packaging_verification, size_verification, color_verification, price_verification, seal_verification, label_verification, sku_verification, product_matching
- convert_to_regex() – word-boundary regex for these keywords
- contains_patterns(val) – decides the type from label/question text (e.g. "category" → category_verification)
- type_check, opposite_boolean, fix_bool – boolean/numeric values and their opposites (y/n, true/false, etc.)
- damage_keywords, usage_keywords, neg_value – for damage/usage labels and negative values
- std_prompt – Result + Evidence template per verification type ("Based on pickup images, vendor images… Set Result to …")
- clean_string() – strips special characters and extra spaces
- fix_verification_questions(questions_list) – for each question, inspects Type/ValueToCheck/Label and sets the Result + Evidence instruction (boolean/numeric/text/category plus damage/usage special cases); Confidence is "0-100 percentage…", isRetakeImage is "True if… False if…"
- generate_question_template(…) – uses the vendor/pickup image counts, product description, and question list to build the full prompt: image placeholders (Image-1 … Image-N), vendor vs pickup ranges, strict instructions, and the output of fix_verification_questions embedded together with a JSON-schema-like format (qt)
This is the prompt handed to the model.
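The keyword-to-type classification can be sketched like this (the keyword lists shown are a small illustrative subset of the real prompt_filter mapping):

```python
import re

# Illustrative subset of prompt_filter; the real mapping covers twelve types
prompt_filter = {
    "category_verification": ["category"],
    "quantity_verification": ["quantity", "count"],
    "color_verification": ["color", "colour"],
}

def convert_to_regex(keywords):
    # Word boundaries so "color" does not match inside "colorful"
    return re.compile(r"\b(" + "|".join(map(re.escape, keywords)) + r")\b", re.I)

PATTERNS = {qtype: convert_to_regex(words) for qtype, words in prompt_filter.items()}

def contains_patterns(text):
    """Return the first verification type whose keywords appear in the text."""
    for qtype, pattern in PATTERNS.items():
        if pattern.search(text):
            return qtype
    return None
```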
15. Model Load & Inference (1172–1396)
- load_model() – CUDA check; AutoModel.from_pretrained (bfloat16, flash_attention_2, low_cpu_mem_usage); torch.compile(mode="reduce-overhead"); tokenizer; stored in model_state; ProcessingError 503 on failure
- run_model_inference(pixel_values, questions, num_patches) – calls model.chat(tokenizer, pixel_values, questions, num_patches_list=num_patches, …) with generation_config; inference_mode + CUDA stream; sync; returns the responses
- save_to_database(data_dict, table_name) – async; uses drop_if_exists=True for rvpxbverification_fail, otherwise a simple insert; conn.insert_single_dict_simple
- save_to_database_error() – create_table_from_dict + insert (for the error/fail table)
- lifespan(app) – startup: load_model, start_prefetch_pipeline; shutdown: stop prefetch, close session, shut down thread pools, delete model, clear CUDA cache
16. String Match & Time (1398–1426)
- find_most_similar_string(main_string, list_of_strings, threshold=90) – uses fuzzywuzzy (token_set_ratio) to find the most similar string in the list; returns (string, score) when score >= threshold, else (None, 0). Used to match vendor question labels against model response labels.
- get_current_time() – UTC → IST (Asia/Kolkata), formatted "YYYY-MM-DD HH:MM:SS"
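A stdlib stand-in for these two helpers, with difflib substituting for fuzzywuzzy's token_set_ratio and a fixed +05:30 offset substituting for ZoneInfo("Asia/Kolkata"):

```python
import difflib
from datetime import datetime, timedelta, timezone

def find_most_similar_string(main_string, candidates, threshold=90):
    """Same return contract as the fuzzywuzzy version, different scorer."""
    best, best_score = None, 0
    for cand in candidates:
        # difflib ratio scaled to 0-100, case-insensitive like token_set_ratio
        score = int(difflib.SequenceMatcher(
            None, main_string.lower(), cand.lower()).ratio() * 100)
        if score > best_score:
            best, best_score = cand, score
    return (best, best_score) if best_score >= threshold else (None, 0)

def get_current_time():
    # IST is a fixed UTC+05:30 offset with no DST, so a static tz is safe
    ist = timezone(timedelta(hours=5, minutes=30))
    return datetime.now(timezone.utc).astimezone(ist).strftime("%Y-%m-%d %H:%M:%S")
```

Note difflib scores character overlap rather than token sets, so it is stricter about word order than the real matcher.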
17. Main Processing – process_request_with_prefetch (1428–1736)
This is the main flow:
- Start – start_time, request_id, data_to_save = request.model_dump(), timestamp; background_tasks.add_task(save_to_database, data_to_save, ‘rvpxbverification_request_data’) – the request is saved to the DB up front.
- Model check – ProcessingError 503 if the model isn't loaded.
- Semaphore – async with get_semaphore() enforces the concurrency limit.
- Prefetch try – get_prefetched_data(timeout=0.05) checks whether already-preprocessed data is waiting in processed_queue.
- Hit – use those vendor_tensors and pickup_tensors; set the counts.
- Miss – for the same request: download the vendor URLs via the session (rewrite_s3_to_cloudfront), decode the pickup images (base64/URL), turn both into tensors via load_image; then submit_for_prefetch(request, request_id) so a later request can reuse the prefetch (optional).
- Validation – ProcessingError 400 if vendor_img_count or sr_images_count is 0.
- Model input – img_tensors = vendor + pickup; num_patches list; process_verification_data(request.question) yields question_data; generate_question_template(…) yields the questions string; pixel_values = all tensors concatenated, bfloat16, cuda.
- First inference – run_model_inference(pixel_values, questions, num_patches); extract_json(responses) yields result; validate_json_schema(result).
- First inference data – uuid, shippingid, previous_request, Question (prompt), First_inference (result), First_response (raw), Second_* empty, timestamp.
- If validation fails – retry with an extra line in the prompt (confidence numeric, JSON schema); second inference; validate the result again.
- Fails again – second_inference_data is saved to rvpxbverification_fail, then ProcessingError 500.
- Passes – first_inference_data is background-saved to rvpxbverification_fail (retry-success record).
- Deriving the recommendation – pickup_confidence.recommendation from the result (Accept/Reject/Partial/Retry) → ai_recommendation (Pick / Not Pick) and StatusCode (200/202). Unknown values raise ProcessingError.
- Question matching – response labels from verification_results vs question_data_copy (vendor labels). If all labels match and every ValueToCheck appears in the response, force Pick.
- Missed questions – if the response labels != vendor labels: find_most_similar_string picks the best-matching response label for each vendor label; on a match, Result/Evidence/Confidence/isRetakeImage are copied over; otherwise it becomes a missed_question and, depending on type, Result="" or opposite_boolean, Confidence=0, isRetakeImage="true".
- Question2 – question_data_copy converted to response format: Result → answer, boolean fix (fix_bool), IsMandatory as string, isRetakeImage lowercased.
- analysis – modified_analysis_data from result.image_analysis (test_product_description, reference_product_description, comparison).
- Pick/Not Pick logic – if ai_recommendation is ‘Pick’: set every answer to its ValueToCheck; if ‘Not Pick’: for text/number types where ValueToCheck != answer, set answer to "".
- Response dict – shippingid, uuid, product_id, processing_time_ai, Question (Question2), ai_decision, finalAction, analysis, ai_match_score, status, StatusCode, timestamp, total_time.
- DB save – db_dict = response + key renames (Question→question, etc.) + result, prompt, vendorquestion, validation_status, is_latest, created_date, performance_metrics; background_tasks.add_task(save_to_database, db_dict, ‘rvpxbverification’).
- Return – response_dict is returned immediately.
- Errors – ProcessingError is re-raised; any other exception → ProcessingError 500 in create_error_response format.
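The Pick/Not Pick answer adjustment above can be sketched as follows (a simplified reading of the described flow; the field names follow the walkthrough):

```python
def adjust_answers(questions, ai_recommendation):
    """Apply the Pick/Not Pick rule to the Question2-style dicts.

    Pick: every answer is forced to its ValueToCheck.
    Not Pick: text/number answers disagreeing with ValueToCheck are blanked;
    other types (e.g. boolean) are left untouched.
    """
    for q in questions:
        if ai_recommendation == "Pick":
            q["answer"] = q["ValueToCheck"]
        elif q.get("Type") in ("text", "number") and q["answer"] != q["ValueToCheck"]:
            q["answer"] = ""
    return questions
```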
18. Middleware (1738–1764)
RawRequestLoggingMiddleware – reads and logs each request's body (path, method, body); re-attaches the body via receive() so FastAPI can parse it again. Used for debugging.
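The core trick – consume the body, then hand the app a receive() that replays the same bytes – can be demonstrated without Starlette, since ASGI receive channels are just async callables returning message dicts:

```python
import asyncio

async def read_and_replay(receive):
    """Drain the ASGI body stream, return (body, replay_receive)."""
    body = b""
    more = True
    while more:
        message = await receive()
        body += message.get("body", b"")
        more = message.get("more_body", False)

    async def replay():
        # Hands the buffered body back to the framework on its next read
        return {"type": "http.request", "body": body, "more_body": False}

    return body, replay

async def demo():
    # Simulated single-chunk request body, as an ASGI server would send it
    chunks = [{"type": "http.request", "body": b'{"x":1}', "more_body": False}]

    async def receive():
        return chunks.pop(0)

    body, new_receive = await read_and_replay(receive)  # log `body` here
    replayed = await new_receive()                       # framework re-reads it
    return body, replayed["body"]
```

Without the replay step, the framework would find an exhausted stream and fail to parse the request a second time.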
19. FastAPI App & Endpoints (1766–1898)
- app – FastAPI(lifespan=lifespan), title “InternVL v2.5”, version 3.1.0.
- Middlewares – RawRequestLoggingMiddleware, CORS (from config), TrustedHostMiddleware.
- GET / – service name, status, version, features, max_concurrent_requests, prefetch_workers, caching_enabled=False.
- GET /health – model_loaded, cuda, GPU memory, concurrency (active/max), prefetch pipeline (running, workers, queue sizes), cache_enabled=False.
- POST /process_request – takes a RequestItem plus BackgroundTasks; calls process_request_with_prefetch; returns the result; ProcessingError → create_error_response + HTTPException; any other exception → 500 in the same error format.
- GET /metrics – model, cuda, concurrency, prefetch stats, cache_size=0, performance flags, timestamp.
20. Exception Handlers (1900–1940)
- ProcessingError – create_error_response, JSONResponse with exc.status_code.
- HTTPException – if detail already contains an errorMsg dict, it is used as-is; otherwise the standard format via create_error_response.
- Exception – generic “Internal server error”, 500, same format.
21. Main Entry (1942–2058)
- if __name__ == "__main__": uvicorn.run(app, host, port, log_level, access_log=True, reload=False, workers=1)
- Startup logs: immediate response, max concurrent, thread pool, caching disabled
The Flow in One Line
A request arrives → raw body logged → validated (Pydantic) → request saved to the DB (background) → semaphore acquired → images from prefetch or downloaded/decoded synchronously → tensors + question template → model inference → JSON extracted + schema validated → retry if invalid → recommendation (Pick/Not Pick) + question matching + missed-question handling → Question2 + analysis → response dict + result saved to the DB (background) → response returned → errors in the standardized format.
If you need more detail on any specific function or block (e.g. fix_verification_questions, the prefetch worker, the DB tables), just say so and I'll break it down further.