• duuyidong@gmail.com

Blog

2023, This is not the worst

Since 2016, I have been writing a yearly review post for myself, and I haven’t stopped since. However, this year’s review is a bit delayed due to fatigue and a busy schedule. A mortgage, being a father, and joining a new team: these things keep pushing me forward. It feels like being a drowning person trying to grab something to stay afloat, and I felt tired and depressed. As the old saying goes, the adult world is never easy, and you only learn that once you’re in it.



Read more

Producer-Consumer Pattern with Non-Blocking Queue

Metrics are vital for a distributed system. This article describes how to implement a metrics function for a high-TPS system.

The code can be found here: https://github.com/ADU-21/producer-consumer



Read more

Troubleshooting in a Large System

Our server handles 75M daily TPS, which generates a lot of ops work. I’m often asked, “What do you guys do when you’re on call?” This post will demonstrate how we troubleshoot in a distributed system.



Read more

Throttling in Distributed System

Throttling is one of the three effective methods for protecting a high-concurrency system. The other two are caching and downgrading. Throttling is used in many scenarios to limit concurrency and the number of requests. Our service handles tens of millions of TPS, with tens of thousands of hosts serving traffic. Throttling is vital for such a large distributed service.



Read more

Working experience at Alibaba vs Amazon as an engineer

As of today, I’ve been working at Amazon for 1 year and 9 months, almost as long as I worked at Alibaba. I was lucky enough to go through the full project cycle at both companies as an engineer. I’ll describe the facts and my thoughts based on my experience and compare the two companies.



Read more

Hello, 2022!

May he find undeserved bliss wherever he goes.

–The Man from Earth



Read more

Speed Up Your AWS S3 Client

Our team recently had a performance issue with some data processing. Every day, 24 files of 30GB each are generated in S3, and we have a Fargate cluster that downloads and processes the data. It took 12 hours to process all 600+GB of files, which is too slow because we want to increase the file size. After a series of improvements, we successfully reduced the processing time to 1.5 hours.

This is a sample project that explains the improvements we made: https://github.com/ADU-21/s3-parallel-download



Read more

AWS DynamoDB Study Note

DynamoDB was announced by Amazon CTO Werner Vogels in 2012, 14 years after NoSQL was proposed in 1998. It supports key-value and document-oriented data storage.



Read more

AWS Step Functions

Our client has a deployment system that has been in use for more than 10 years and wants to migrate it to the cloud. This post shows how we migrated it step by step from a huge monolithic application to serverless with AWS Step Functions.



Read more

OS Memory Management

In computer systems, the CPU is much faster than the storage system, so ideally we want storage reads and writes to be as fast as possible. Unfortunately, the price of storage media increases exponentially with access speed. Thus, to balance cost and performance, we design a multi-layer memory hierarchy.



Read more

CPU Scheduling and Deadlock

CPU scheduling is the process of determining which process will own the CPU for execution while another process is on hold. The main task of CPU scheduling is to make sure that whenever the CPU is idle, the OS selects one of the processes available in the ready queue for execution.



Read more

Inter-Process Communication (IPC)

To improve CPU utilization and manage jobs more effectively in a multiprogramming system, the operating system (OS) introduces the “process” abstraction. Each process has a Process Control Block (PCB), which contains all the information the OS requires to manage the process, as well as its own resources in user-space memory. Processes are generally independent of each other, but kernel space is shared, so communication between processes must go through the kernel.



Read more

AWS Data Analytics

Big things, fast, minimal setup, maximum security, low cost.



Read more

Time Complexity of Java Collections

Collection data structures and the time complexity of their operations are important fundamental knowledge for becoming a better developer.



Read more

How to Code Python

Python is my favorite language because it’s simple and fast. It has been 3 years since I first learned Python. I wrote a lot of articles about Python during my first year of coding in it; feel free to read them if you understand Chinese well.

This article will focus on Python engineering practices and includes an example of how to build a tool with Python.



Read more

My Name is Luke, come from a story

It has been a long time since I thought about Jackie Chan and the movie I made with him, Skiptrace. I can’t say it was a very successful movie, but it did help me grow very much.



Read more

Welcome To Hexo

Welcome to Hexo! This is your very first post. Check the documentation for more info. If you run into any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.


Read more