As a Tier 3 RPHY engineer on the Ops team, I was directly responsible for the ~120,000 RPD units within the NGAN (Next Generation Access Network) vCMTS network. As a result, I became adept at spotting physical issues along the transmission path, which included analyzing DOCSIS carriers at the signal level for noise, missing QAMs, and similar problems. Along the way I also gained a broad understanding of vCMTS features and the virtualized components that run under the hood.
Because I love Python and automation, I set out to upgrade my team's tooling and created a pip-installable command-line tool. The tool included abstract base models of an RPD object, which allowed different vendors and model types to be implemented easily. I also developed an asynchronous Prometheus querying tool that could be scaled up or down, reaching ~200 requests per second.
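As a rough illustration of the abstract base model idea (a minimal sketch, not the code from the actual tool), each vendor can subclass a shared RPD base and fill in only the vendor-specific pieces. Every name here, including `RPD`, `VendorARPD`, `VendorBRPD`, `firmware_command`, `build_rpd`, and the command strings, is hypothetical.

```python
from abc import ABC, abstractmethod


class RPD(ABC):
    """Hypothetical base model shared by every RPD vendor/model."""

    vendor: str = "unknown"

    def __init__(self, mac: str, hostname: str) -> None:
        self.mac = mac.lower()  # normalize so lookups are case-insensitive
        self.hostname = hostname

    @abstractmethod
    def firmware_command(self) -> str:
        """Return the vendor-specific command for reading the firmware version."""

    def describe(self) -> str:
        # Shared behavior lives on the base class and is inherited as-is.
        return f"{self.vendor} RPD {self.hostname} ({self.mac})"


class VendorARPD(RPD):
    vendor = "vendor-a"

    def firmware_command(self) -> str:
        return "show version"  # placeholder command, assumed for illustration


class VendorBRPD(RPD):
    vendor = "vendor-b"

    def firmware_command(self) -> str:
        return "get firmware-info"  # placeholder command, assumed for illustration


# Adding a new vendor means writing one subclass and registering it here.
REGISTRY = {cls.vendor: cls for cls in (VendorARPD, VendorBRPD)}


def build_rpd(vendor: str, mac: str, hostname: str) -> RPD:
    """Create the right RPD subclass for a vendor name."""
    try:
        return REGISTRY[vendor](mac, hostname)
    except KeyError:
        raise ValueError(f"unsupported vendor: {vendor}") from None
```

With this shape, calling code only ever works with the `RPD` interface, so supporting another vendor or model type does not require touching the rest of the tool.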
Many DevOps practices were built into this package on the development side: semantic versioning, a CI/CD pipeline using GitHub Actions (GHA) with unit tests and linting, automatic deployment to GitHub Pages, and automatic tagging and artifact generation. I also used Sphinx to auto-generate documentation from the docstrings of the functions, classes, and methods I created.
Comcast automates as much as it possibly can, and I was directly involved in this process. I created alerts using their telemetry stack, along with remediations designed to 'self-heal' easily caught situations. I was also responsible for monitoring and reporting on their Ansible deployment pipelines.
My team was directly responsible for developing the firmware on Spectrum's CPE routers. This work gave me a strong familiarity with many routing and Linux concepts, and a feel for what a tech stack looks like.
To deploy this firmware, we needed an automated testing environment that let developers quickly test their code on a real physical device. That's where the integration lab came in. We deployed Ansible-configured GitLab runners on networked RPI devices, which ran a Python test suite against an actual Spectrum router. This let developers run nightly cron-scheduled tests as well as test their code on demand.
These test setups would eventually go down, so we needed a way to monitor them. Using PostgreSQL and Flask, we created an in-house monitoring and health-check system that alerted us when a setup was deemed "unhealthy". The same tool was pip-installable, so on-demand health checks could be run from the command line. This made engineers in the lab more productive, as they no longer had to wait for tests to fail to learn that a setup was unhealthy.
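To give a sense of what "unhealthy" could mean in practice, here is a minimal sketch of the decision logic only, using the standard library rather than the PostgreSQL and Flask stack the real system ran on. The fields (`last_heartbeat`, `router_reachable`, `runner_online`), the 10-minute threshold, and the function names are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SetupStatus:
    """Hypothetical snapshot of one test setup (runner plus router)."""

    name: str
    last_heartbeat: datetime
    router_reachable: bool
    runner_online: bool


def evaluate(
    status: SetupStatus,
    now: datetime,
    max_age: timedelta = timedelta(minutes=10),  # assumed threshold
) -> list[str]:
    """Return the list of problems found; an empty list means healthy."""
    problems = []
    if now - status.last_heartbeat > max_age:
        problems.append("stale heartbeat")
    if not status.router_reachable:
        problems.append("router unreachable")
    if not status.runner_online:
        problems.append("runner offline")
    return problems


def is_healthy(status: SetupStatus, now: datetime) -> bool:
    return not evaluate(status, now)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    setup = SetupStatus("lab-setup-01", now - timedelta(minutes=30), True, False)
    # Print a one-line report, similar to what an on-demand CLI check might show.
    print(setup.name, "->", evaluate(setup, now) or "healthy")
```

Returning the list of problems instead of a bare boolean lets the same function drive both an alert message and a command-line report.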