Operations and Maintenance Engineer

Best Web3

SAR 26.2K-31.9K [monthly salary]
On-site - Kuala Lumpur | 1 to under 3 years of experience | Bachelor's degree | Full-time

Open to: Malaysians, pass holders (EP/TEP/PV/DP/RP-T/other), foreigners

Job Description

Job Responsibilities

I. Infrastructure and Server Operations (Core Responsibilities)

  • Responsible for the architecture design, setup, and optimization of the company's server clusters (OCI / AWS).
  • Manage Linux servers, system environments, user permissions, SSH keys, SFTP, Firewall, and Security Groups.
  • Responsible for Nginx, SSL, reverse proxy, domain name, and certificate management, maintaining high availability and security.
  • Maintain virtual machines, load balancers (LB), object storage, VPC/VCN networks, subnets, and security group policies.
  • Troubleshoot production environment issues: port conflicts, permission errors, service startup failures, full disks, network anomalies, etc.

II. CI/CD and Deployment Management

  • Design, build, and maintain CI/CD pipelines (GitHub Actions / GitLab CI / Jenkins).
  • Write and maintain deployment scripts, automated build scripts, environment variable management, and version release processes.
  • Responsible for deployment strategies, rollback strategies, blue-green deployments, and canary deployments in testing/UAT/production environments.
  • Collaborate with the R&D team for daily releases, emergency fixes, and configuration management.

III. System Stability and Availability (SRE Focus)

  • Establish an application monitoring system (Prometheus, Grafana, ELK, CloudWatch).
  • Responsible for building an alerting system: CPU/Memory/Disk, service anomalies, and interface anomalies.
  • Responsible for the formulation and implementation of SLAs, SLOs, and SLIs to improve system stability.
  • Perform regular capacity planning, performance optimization, and system load testing.

IV. Security and Access Control

  • Manage server accounts, cloud platform accounts, Git repository permissions, and Jira/Wiki system permissions.
  • Build/maintain bastion hosts (Jump Server/Bastion), adhering to the principle of least privilege.
  • Write security baseline policies and regularly perform patch upgrades, vulnerability scanning, and security inspections.
  • Cooperate with the security/risk control team to handle security incidents (brute-force attacks, abnormal traffic, service vulnerabilities, etc.).

V. Database and Middleware Maintenance

  • Maintain the deployment, backup, and master-slave configuration of services such as MySQL, PostgreSQL, Redis, and Kafka.
  • Perform database performance tuning, slow SQL analysis, and connection pool optimization.
  • Implement backup strategies, automatic backups, off-site disaster recovery, and regular recovery drills.

VI. Documentation and Asset Management

  • Maintain server ledgers, domain certificate ledgers, and permission lists.
  • Write and maintain operation and maintenance documentation: deployment instructions, deployment processes, security policies, and architecture diagrams.
  • Manage operation and maintenance assets: server specifications, monitoring panels, keys, environment configurations, and network topology diagrams.

VII. Team and Process Development

  • Responsible for the daily management and training of the operation and maintenance team.
  • Drive the implementation of production change processes, deployment procedures, permission management procedures, and disaster recovery procedures.
  • Coordinate across teams (R&D, backend, DBA, and security teams) to handle emergency failures.

Job Requirements

  • Proficient in Linux system administration, Shell scripting, and network basics (Layer 3/Layer 4/Layer 7).
  • Familiar with cloud platform operation and maintenance: OCI/AWS.
  • Proficient in Nginx, SSL, reverse proxy, Keepalived, and load balancing.
  • Familiar with Docker/Kubernetes (proficiency in at least Docker and Compose is required).
  • Familiar with CI/CD pipelines (GitHub Actions / GitLab CI / Jenkins).
  • Proficient in MySQL basics, master-slave replication, backup and recovery, and performance optimization.
  • Familiar with at least one commonly used middleware such as Redis, Kafka, or RabbitMQ.
  • Experience in building monitoring systems: Prometheus / Grafana / ELK / Loki.
  • Bonus points: Strong logical thinking and rapid troubleshooting abilities; able to independently handle online incidents.
  • A complete operational system mindset: monitoring, alerting, security, permissions, and processes.
  • Excellent documentation skills; able to organize asset tables, network topology, and process procedures.
  • Strong communication and cross-team collaboration skills.
  • Experience in operations and maintenance in the financial, exchange, and blockchain industries.
  • Familiar with high-concurrency and high-availability architecture design.

Monitoring Tools (Prometheus, Grafana); Database Management (SQL, MongoDB); Infrastructure as Code (Terraform, Ansible); Cloud Services (AWS, Azure, GCP); Linux; CI/CD; Git; Shell scripting

Claire 12

HR Manager, Best Web3


Posted on 21 January 2026

