The JD Logistics operations portal (运维港) mainly supports day-to-day logistics operations: it is used to look up abnormal shipments and to resolve logistics problems.
Generally speaking, "operations engineer" refers to an operations and maintenance (O&M) engineer at an Internet company. The role usually sits in the technical department; alongside R&D, testing, and system administration, operations is one of the four main departments that support an Internet product's technology. How these functions are divided varies between domestic and foreign companies and between large and small ones. The main responsibilities are as follows:
1. Keep business systems running stably over the long term
Even a small fault in a business system leads to user complaints, so the core job of an operations engineer is to keep business systems running stably.
First, you need to know what the business runs on. Typically the web server is nginx or Apache, data is stored in a MySQL database, and PHP interprets the pages, so operations engineers must know how to deploy LNMP and LAMP environments.
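Once such a stack is deployed, a quick first check is whether each component accepts connections at all. The following is a minimal sketch, not a full health check: the service names and ports in DEFAULT_SERVICES are common defaults assumed for illustration (php-fpm in particular is often configured to listen on a Unix socket instead of TCP port 9000), and the function names are hypothetical.

```python
import socket

# Assumed defaults for illustration only; adjust to the actual deployment.
DEFAULT_SERVICES = {"nginx": 80, "php-fpm": 9000, "mysql": 3306}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_stack(host, services=DEFAULT_SERVICES, timeout=1.0):
    """Map each service name to True/False depending on whether its port answers."""
    return {name: port_is_open(host, port, timeout)
            for name, port in services.items()}
```

Calling check_stack("127.0.0.1") on the server returns a dictionary such as one entry per service name with a boolean value. An open port only shows that something is listening; it does not prove the service is answering requests correctly.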
2. Keep data secure and reliable
Data security is what company leadership cares about most, so operations engineers must keep data both secure and reliable. Even a small mistake is likely to end with the operations team being called in for a talk with management.
Sometimes you need to change database contents by hand, so you must know MySQL's create, read, update, and delete (CRUD) operations;
Sometimes the database server's hardware fails, so you need MySQL master-slave replication as a fallback;
Sometimes you need to restore a database, so you must know MySQL incremental backup and recovery in order to restore to a specific point in time;
Sometimes scheduled backups are not enough, and you need rsync+inotify for real-time backup;
Sometimes, to harden a server, you need iptables rules that restrict access to the company's IP addresses or the jump server's IP address.
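The iptables item above can be illustrated with a small sketch that generates allowlist rules. It only builds the commands and does not run them; the function name, the choice of port 22, and the sample addresses (taken from documentation-reserved ranges) are assumptions for illustration, and a real ruleset would also need to account for existing rules and for IPv6.

```python
import ipaddress


def build_allowlist_rules(allowed_sources, port=22, chain="INPUT"):
    """Return iptables commands (as argument lists) that accept TCP traffic to
    `port` only from `allowed_sources` and drop it from every other source."""
    rules = []
    for source in allowed_sources:
        # ip_network validates the input; strict=False tolerates host bits in a CIDR.
        network = ipaddress.ip_network(source, strict=False)
        if network.version != 4:
            # iptables handles IPv4 only; IPv6 would need ip6tables.
            raise ValueError(f"not an IPv4 address or network: {source}")
        rules.append(["iptables", "-A", chain, "-p", "tcp", "-s", str(network),
                      "--dport", str(port), "-j", "ACCEPT"])
    # Appended last, so it only catches traffic that no ACCEPT rule matched.
    rules.append(["iptables", "-A", chain, "-p", "tcp",
                  "--dport", str(port), "-j", "DROP"])
    return rules


# Hypothetical office network and jump server addresses.
for rule in build_allowlist_rules(["198.51.100.0/24", "203.0.113.10"]):
    print(" ".join(rule))
```

Rule order matters because iptables evaluates a chain from top to bottom: the ACCEPT rules must come before the final DROP. Printing the commands first makes it possible to review them before applying anything as root.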
3. Build a monitoring and alerting system
Operations engineers commonly use Zabbix and Nagios for monitoring and alerting. Without monitoring, operations is flying blind, so build the monitoring and alerting system first; only then can system failures be dealt with.
Common faults include application faults, database faults, and network cable faults; some are software faults and some are hardware faults. An experienced operations engineer can pinpoint the cause of a fault right away.
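The idea behind an alert rule can be sketched independently of any particular tool. The class below is a hypothetical illustration and not the Zabbix or Nagios API: it raises an alert only after a metric exceeds its threshold several samples in a row, which is one common way to avoid alerting on a single spike.

```python
from dataclasses import dataclass


@dataclass
class ThresholdCheck:
    name: str
    threshold: float
    required_breaches: int = 3   # consecutive samples above threshold before alerting
    _streak: int = 0             # current run of consecutive breaches
    alerting: bool = False

    def observe(self, value):
        """Feed one sample; return "ALERT", "RECOVERED", or None if the state is unchanged."""
        if value > self.threshold:
            self._streak += 1
            if not self.alerting and self._streak >= self.required_breaches:
                self.alerting = True
                return "ALERT"
        else:
            self._streak = 0
            if self.alerting:
                self.alerting = False
                return "RECOVERED"
        return None


cpu = ThresholdCheck("cpu_percent", threshold=90.0, required_breaches=2)
print([cpu.observe(v) for v in [95, 50, 95, 96, 97, 10]])
# prints [None, None, None, 'ALERT', None, 'RECOVERED']
```

The first sample of 95 does not alert because the following sample drops below the threshold and resets the streak; the alert fires only on the second consecutive breach.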
4. Handle technical and business problems
There are two core kinds of problems here: technical and business. Technical problems mainly call for network packet capture and analysis (for example with tcpdump), an understanding of proxy mechanisms, and so on.
Business problems are somewhat more complex than technical ones. Business-level data analysis, for example, means not only compiling the various business metrics but also analyzing and dissecting the data to find where the business problem lies.
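As one possible illustration of compiling a metric and then breaking it down, the sketch below computes the request count and server-error rate per URL path from web access logs. The article does not say which data is analyzed; the log lines here are made up and are assumed to follow nginx's default "combined" access log format, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the start of a "combined" format line up to the status code.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def summarize(lines):
    """Return {path: (request_count, share_of_5xx_responses)} for parseable lines."""
    totals, errors = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        path = match.group("path").split("?", 1)[0]  # ignore the query string
        totals[path] += 1
        if match.group("status").startswith("5"):
            errors[path] += 1
    return {path: (totals[path], errors[path] / totals[path]) for path in totals}


# Made-up sample data for illustration.
sample = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0800] "GET /orders?id=1 HTTP/1.1" 200 512 "-" "curl/8.0"',
    '203.0.113.6 - - [10/Oct/2024:13:55:37 +0800] "GET /orders?id=2 HTTP/1.1" 502 166 "-" "curl/8.0"',
    '203.0.113.7 - - [10/Oct/2024:13:55:38 +0800] "GET /track HTTP/1.1" 200 1024 "-" "curl/8.0"',
]
print(summarize(sample))
# prints {'/orders': (2, 0.5), '/track': (1, 0.0)}
```

An overall error rate would hide the fact that only /orders is failing; grouping by path is the "dissecting" step that points to where the problem sits.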
5. Version testing and release
This is also a routine part of an operations engineer's work: testing versions and taking them live. Before developers release a version, operations engineers need to run performance and functional tests. It is also best to release at night, when business volume is low, so that the rollout does not happen under heavy load.
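Choosing the release time can be based on measured traffic instead of guesswork. The sketch below is a hypothetical helper that assumes you already have 24 hourly request counts (index 0 is midnight to 1 a.m.) and finds the contiguous window with the least traffic, wrapping around midnight; the numbers in the example are invented.

```python
def quietest_window(hourly_requests, window_hours=2):
    """Return (start_hour, total_requests) for the contiguous window with the
    least traffic. The window may wrap around midnight; ties go to the
    earliest start hour."""
    if len(hourly_requests) != 24:
        raise ValueError("expected exactly 24 hourly counts")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")
    best_start, best_total = 0, None
    for start in range(24):
        total = sum(hourly_requests[(start + i) % 24] for i in range(window_hours))
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Invented traffic profile: flat during the day, quiet around midnight.
traffic = [100] * 24
traffic[23] = 5
traffic[0] = 7
print(quietest_window(traffic, window_hours=2))
# prints (23, 12)
```

Here the quietest two-hour window starts at 23:00 and spans midnight, which a search that did not wrap around would miss.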
Summary
Operations and development are two quite different paths. That said, an operations engineer with a solid development background can move into a development role.
An operations engineer is responsible for operating specific product lines and also needs development skills. Working deep inside the business, operations engineers know its pain points and problems best, and they build and optimize platforms, tools, and methods for the product's business needs. They get exposure to many well-designed system architectures and learn to compare their strengths and weaknesses. How well an operations engineer understands the business determines the role they play in its growth.