AI Navigator – Large Model Cloud Inference Deployment Engineer

1 week ago


Hong Kong Island, Hong Kong SAR China SenseTime 商汤科技 Full time
SenseTime Group Limited

China

On-site Part Time Internship

Skills:

C++, Python

Job Responsibilities

  • Optimize the inference deployment of large models on computing clusters, focusing on multi-node, multi-GPU parallel inference, task scheduling, KV cache management, and other techniques to enhance inference performance and reduce costs.
  • Research the latest advancements in large-model serving and integrate cutting-edge techniques into real-world business applications.

Job Requirements

  • Deep understanding of mainstream large-model algorithms and underlying principles.
  • Familiarity with large-model inference pipelines and optimization techniques such as Continuous Batching and Paged Attention.
  • Proficiency in mainstream large-model inference engines, such as ppl.llm, vLLM, TensorRT, TGI, or experience with traditional inference engines.
  • Strong software engineering foundation, familiarity with design patterns, and proficiency in C++ and Python.
  • Strong learning ability, communication skills, and the ability to articulate complex technical concepts clearly.
Additional Information

Job Level

Internship

Publish Date

17/02/2025

Job Ref. No.

N/A

Job Function

Research / Analysis

Research & Development (R&D)

Company Overview

SenseTime is the world's leading artificial intelligence platform company valued above USD 4.5 billion, and is the fifth national AI platform in China. Focused primarily on computer vision and deep learning, the company has independently developed a deep learning platform and a deep learning supercomputing center. With its proprietary technologies serving as its fundamental driver, SenseTime has established an R&D center, integrated with various industries, and forged partnerships across the board to create an AI ecosystem. With offices in Hong Kong, Beijing, Shenzhen, Shanghai, Chengdu, Hangzhou, Kyoto, Tokyo and Singapore, SenseTime has attracted top talent from around the world to build a world-leading technology company.

  • AI Model Engineer

    8 hours ago


    Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    Pantheon Lab Limited is a leading Generative AI company specializing in digital human technologies and advanced digital assistant solutions. Job Description: We are seeking a highly skilled AI Model Engineer to join our team. As an AI Model Engineer, you will be responsible for building large-scale end-to-end machine learning systems for APAC projects and...


  • Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    As a Generative AI Developer at Pantheon Lab Limited, you will be part of a dynamic team that is revolutionizing the field of digital human technologies and advanced digital assistant solutions. We are seeking an experienced professional to build large-scale end-to-end machine learning systems for APAC projects and company products related to Digital Humans...


  • Hong Kong, Central and Western District, Hong Kong SAR China ChainOpera AI Full time

    ChainOpera AI is the world's first truly decentralized and open AI platform for simple, scalable, and trustworthy collaborative AI economy, and the AI app ecosystem for accessible and democratized AI - our GPUs, our model, our personal AI. ChainOpera AI is supported by: Enterprise-level generative AI platform for system scalability, model performance, and...

  • AI Model Engineer

    4 days ago


    Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    We are looking for an AI Model Engineer to join our team at Pantheon Lab Limited. The ideal candidate should have a Bachelor's degree or above in Computer Science, Artificial Intelligence, Machine Learning, or a related field. You will be responsible for conducting research to explore new techniques and methodologies in generative AI. As an AI Model Engineer,...


  • Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    About the Role: We are seeking a talented Generative AI Developer to join our team at Pantheon Lab Limited and contribute to the development of digital human technologies and advanced digital assistant solutions. Key Responsibilities: Develop and deploy large-scale end-to-end machine learning systems for APAC projects and company products related to...

  • AI Model Developer

    7 days ago


    Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    Pantheon Lab Limited is a leading Generative AI company specializing in digital human technologies and advanced digital assistant solutions. We are seeking an experienced Generative AI Developer to join our team and contribute to the development of large-scale end-to-end machine learning systems for APAC projects and company products related to Digital...


  • Hong Kong, Central and Western District, Hong Kong SAR China ChainOpera AI Full time

    ChainOpera AI is a pioneering decentralized and open AI platform, empowering collaborative AI economies and democratized AI ecosystems. The company is backed by enterprise-level generative AI, open-source libraries, innovative edge-cloud models, internet veterans, and established researchers in blockchain, machine learning, and distributed systems. We're...


  • Hong Kong, Central and Western District, Hong Kong SAR China beNovelty Limited Full time

    We are looking for a talented Large Language Model Designer to join our team at beNovelty Limited. In this role, you will be responsible for designing effective prompts for large language models to solve complex real-world problems. You will work closely with our AI and product teams to develop, fine-tune, and optimize prompts for large language models...

  • AI Prompt Engineer

    7 days ago


    Hong Kong, Central and Western District, Hong Kong SAR China beNovelty Limited Full time

    We are seeking a highly skilled AI Prompt Engineer to join our team at beNovelty Limited, a leading API technology company. As a key member of our team, you will play a crucial role in designing effective prompts for large language models to solve complex real-world problems. About the Role: As an AI Prompt Engineer, you will work closely with our AI and...


  • Hong Kong, Central and Western District, Hong Kong SAR China ChainOpera AI Full time

    We're ChainOpera AI, a leader in the decentralized AI ecosystem. Our platform offers a unique opportunity for developers to work on cutting-edge projects and collaborate with top-notch researchers. This role requires a highly skilled software engineer who can design, develop, and maintain our AI platform and applications. You'll work closely with our research...


  • Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    Pantheon Lab Limited is at the forefront of Generative AI, specializing in digital human technologies and advanced digital assistant solutions. As a Deep Learning Specialist, you'll join our team of innovators to develop and deploy cutting-edge AI models that transform real-world interactions. Your Key Responsibilities: Develop and train large language models...


  • Hong Kong, Central and Western District, Hong Kong SAR China VisionMatrix Technology Limited Full time

    Responsibilities: 1. Infrastructure Management: Design scalable cloud infrastructure; Configure distributed computing resources; Optimize GPU/compute allocation; Manage AWS services (SageMaker, EC2); Implement cost-effective training solutions. 2. Training Pipeline Development: Create end-to-end ML workflows; Implement CI/CD for machine learning; Develop automated training...

  • AI Prompt Engineer

    7 days ago


    Hong Kong, Central and Western District, Hong Kong SAR China beNovelty Limited Full time

    We are seeking a highly creative and analytical AI Prompt Engineer to join our team and help us design effective prompts that enable our AI systems to perform at their best. You will be working closely with our AI team to develop, fine-tune, and optimize prompts for large language models across a wide range of tasks. This role is essential to unlocking the...


  • Hong Kong Island, Hong Kong SAR China RAISOUND (HONGKONG) CO., LIMITED Full time

    We are seeking a Large Model Research Scientist to join our team at RAISOUND (HONGKONG) CO., LIMITED. This role will involve developing and researching large models, especially multi-modal models, to support various business applications. Key Responsibilities: Research and develop large models to support different business scenarios. Collaborate with software...


  • Hong Kong, Central and Western District, Hong Kong SAR China Rapport AI Medical Full time

    About the Role: Rapport AI Medical is seeking an experienced Software Systems Analyst / Project Manager to join our young team and contribute to our growing healthcare AI platform. The ideal candidate will have a background in software development and AI/ML, with experience in SaaS development, verification, validation, and deployment, as well as cloud...

  • AI Model Architect

    2 days ago


    Hong Kong Island, Hong Kong SAR China RAISOUND (HONGKONG) CO., LIMITED Full time

    The position of AI Model Architect at RAISOUND (HONGKONG) CO., LIMITED involves designing and developing large models, particularly multi-modal and other big models, to support various business applications. Main Responsibilities: Design and develop large models to support different business scenarios. Collaborate with software and hardware R&D teams to provide...


  • Hong Kong, Central and Western District, Hong Kong SAR China RAISOUND (HONGKONG) CO., LIMITED Full time

    Job Overview: We are seeking a highly skilled Senior Large Model Researcher to join our R&D team in Hong Kong. The ideal candidate will have extensive experience in developing and deploying large models, particularly in areas such as multi-round dialogue, question answering technology, and knowledge systems. Responsibilities: Develop and implement large models...


  • Hong Kong Island, Hong Kong SAR China Intellectsoft Full time

    Responsibilities: The successful candidate will be responsible for: designing the architecture for the open-source-based data analytics platform; developing scalable data models, data pipelines, and data lakes; ensuring integration of various data sources, including Kafka, NiFi, Apache Airflow, and Spark; implementing modern data platform components like...


  • Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    At Pantheon Lab Limited, we are committed to pushing the boundaries of digital human technologies and advanced digital assistant solutions. As a Generative AI Developer, you will play a crucial role in building large-scale end-to-end machine learning systems for APAC projects and company products related to Digital Humans and Gen AI solutions. About the...

  • AI Model Developer

    8 hours ago


    Hong Kong, Central and Western District, Hong Kong SAR China Pantheon Lab Limited Full time

    Job Description: We are seeking a skilled AI Model Developer to join our team at Pantheon Lab Limited. As a key member of our product development team, you will be responsible for designing, developing, and implementing generative AI models for various APAC projects. Key Responsibilities: Design and develop generative AI models, such as GANs, VAEs, and...