17/Nov/2022

Senior Software Engineer (Web Crawling)

(No. of Positions: 1)

Location - Noida, India | Full-Time



Job Summary

The Senior Software Engineer (Web Crawling) will be primarily responsible for improving the existing data extraction/web scraping tools and workflows, and for enhancing and developing web crawling algorithms in line with the latest technologies and trends. The engineer should focus continuously on identifying and fixing scraping-related issues and on scaling the scrapers to download huge volumes of data.


  • Education Requirements

    BE/B.Tech/MCA Degree in Computer Science, Engineering, or similar relevant field

  • Salary

    Not Disclosed

  • Benefits/Hours

    Medical Coverage, Accidental Insurance, Life Insurance, Eligibility for Gratuity after 5 years of continuous service, Paid Time Off, Paid Holidays, and Floating Holidays


Role/Responsibilities

  • Architect, research, and develop a distributed crawler and data acquisition system
  • Design and develop the service architecture for distributed crawler modules and the data storage architecture
  • Fulfill day-to-day web data capture requirements and monitor the quality of collected data
  • Extract, clean, deduplicate, and compute statistics on crawled data
  • Optimize the crawling strategy to make full use of bandwidth resources, avoid various restrictions, and improve crawl effectiveness

Knowledge & Skill Requirements

  • BE/B.Tech/MCA Degree in Computer Science, Engineering, or similar relevant field
  • 3+ years of enterprise-level web crawler development experience
  • Conscientious, meticulous, and pragmatic, with strong learning ability; enjoys solving technical problems, brings original ideas, and is willing to take on challenges
  • Familiar with the Linux platform, with solid fundamentals in Java or Python; the ability to design and write a crawler system independently is preferred
  • Familiar with the principles and techniques of web crawling, regular expressions, multithreading, and the HTTP protocol, and able to extract information from structured and unstructured data
  • Familiar with crawling concepts and processes: seeding, parsing, downloading, deduplication, extraction, filtering, scheduling, asynchronous processing, etc.
  • Familiar with one or more open-source technologies such as WebMagic / Scrapy / Heritrix / HtmlParser / Jsoup / HttpClient
  • Experience with verification challenges and anti-crawling measures, distributed crawler architecture, data mining, and building data warehouses is preferred
  • A background in data mining, natural language processing, information retrieval, and machine learning is preferred
  • Good knowledge of API/REST API development
  • Solid understanding of bash scripting
  • Familiarity with proxy technologies

Additional Qualifications

  • Excellent written and verbal communication skills
  • Excellent organizational and time management skills
  • Strong problem-solving and analytical skills
  • Strong attention to detail and documentation
  • Strong foundation in object-oriented programming
  • Ability to work in a team-based environment
  • Willingness to work a flexible schedule
  • Ability to work with minimal supervision in a highly dynamic, timeline-sensitive work environment

About Intrics & the team

Intrics is a spin-off of a 30-year market leader in the retail data space with active relationships with virtually every major North American retailer. Intrics provides intelligent solutions leveraging the winning combination of progressive technical expertise (data science, analytics, advanced engineering, etc.) and extensive retail domain knowledge.

Never before has a company been more uniquely positioned to service the retail industry. Intrics combines a deep understanding of retail operations and strategies with the technological expertise to leverage the massive data influx and extract the most valuable and actionable information.

Why become part of Intrics

The retail industry continues to see unprecedented dynamics associated with the pivot to a true omni-channel shopping experience. Informed retailers are succeeding, and Intrics is providing the solutions and insights used to make million-dollar decisions each day. With Intrics you will be at the forefront of retail innovation and a true partner to the industry.

Are you ready to contribute to solutions sought after by North America’s leading retailers? If so, let’s talk.
