Dive into the world of backend development with our live coding course, where you’ll learn how to build the backbone of a publishing platform application akin to Medium.com. This course will utilize a powerful stack including NestJS, TypeORM, PostgreSQL, and TypeScript to construct a robust backend.
Throughout the sessions, we will focus exclusively on backend development. Although we won’t be constructing the frontend from scratch, we will integrate with pre-existing frontend code to demonstrate how the backend functions within a full application context.
In today’s digital age, music streaming services have become an integral part of our daily lives. Among them, platforms like Spotify lead the pack with their vast libraries and intuitive user experiences. Designing such a service involves a complex interplay of algorithms, databases, and networking. This article delves into the core components required to build a scalable and reliable music streaming service.
Core Features of a Music Streaming Service
Before diving into the system design, let’s outline the fundamental features our service will offer:
Music Playback: Allows users to stream music tracks.
Search Functionality: Enables users to find songs, albums, artists, and playlists.
User Account Management: Supports user registration, authentication, and profile management.
Playlist Creation and Management: Users can create, share, and edit playlists.
Recommendation Engine: Suggests music based on listening habits.
Initial Phase: Base Version
Requirements:
Users: 1 million, who play the songs
Songs: 20 million
Artists: 100,000, who upload the songs
System Architecture Overview
Designing a service capable of handling millions of concurrent users requires a robust architecture. Below is a simplified overview:
App: Web or mobile app through which users will interact with the music streaming service
Web Servers: Handle API requests such as user authentication, metadata retrieval, and search queries.
Load Balancers: Distribute incoming requests evenly across a network of servers to prevent any single point of overload.
Application Servers: Process business logic, including playlist management and music recommendation algorithms.
Database Storage: Structured data such as user data, music metadata, and playlists is stored in a SQL database, since SQL manages relationships between records well and supports complex queries efficiently.
Blob Storage: Song files are stored in a Blob (Binary Large Object) storage service, e.g. Cloudflare R2, AWS S3, Google Cloud Storage, or Azure Blob Storage. These services are designed for large unstructured data, which allows for efficient storage and retrieval.
CDN (Content Delivery Network): Distributes music files globally to minimize latency during music playback, using a CDN service such as CloudFront or Cloudflare.
Cache Layer: Improves data retrieval performance by temporarily storing frequently accessed data. With an LRU (Least Recently Used) eviction strategy, popular songs stay in the cache, while unpopular songs are cached on demand and are the first to be evicted when space runs out. This is usually implemented by the CDN service provider.
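To make the LRU idea from the Cache Layer concrete, here is a minimal sketch of an LRU cache. The class name LruCache and the song keys in the usage note are made up for illustration, and a real CDN cache is far more involved than this.

```typescript
// Minimal LRU cache sketch (hypothetical helper, not a CDN implementation).
// A Map remembers insertion order, so its first key is always the least
// recently used entry.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();

  constructor(private readonly capacity: number) {
    if (capacity < 1) {
      throw new Error("capacity must be at least 1");
    }
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Delete and re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.capacity) {
      // Cache is full: evict the least recently used entry.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

For example, with a capacity of 2, inserting song-1 and song-2, reading song-1, and then inserting song-3 evicts song-2, because song-1 was used more recently.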
Storage Estimation
Song Storage
Assumption: Average song size is 3MB.
Calculation: With 20 million songs, the total storage needed is 3MB×20,000,000=60,000GB or 60TB
Song Metadata Storage
Assumption: Average metadata size per song is about 100 bytes.
Calculation: For 20 million songs, 100 bytes×20,000,000=2GB
User Metadata Storage
Assumption: On average, 1KB of data per user.
Calculation: For 1 million users, 1KB×1,000,000=1GB
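The three estimates above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1MB = 1,000,000 bytes), and the Workload type and estimateStorage function are names invented for this example.

```typescript
// Decimal storage units, matching the rounding used in the estimates.
const KB = 1_000;
const MB = 1_000_000;
const GB = 1_000_000_000;
const TB = 1_000_000_000_000;

// Hypothetical description of a workload; all sizes are in bytes.
interface Workload {
  songs: number;
  users: number;
  avgSongBytes: number;
  metadataBytesPerSong: number;
  bytesPerUser: number;
}

interface StorageEstimate {
  songBytes: number;
  songMetadataBytes: number;
  userBytes: number;
}

function estimateStorage(w: Workload): StorageEstimate {
  return {
    songBytes: w.songs * w.avgSongBytes,
    songMetadataBytes: w.songs * w.metadataBytesPerSong,
    userBytes: w.users * w.bytesPerUser,
  };
}

// Initial phase: 20 million songs and 1 million users.
const initialPhase = estimateStorage({
  songs: 20_000_000,
  users: 1_000_000,
  avgSongBytes: 3 * MB,
  metadataBytesPerSong: 100,
  bytesPerUser: 1 * KB,
});

console.log(`Songs: ${initialPhase.songBytes / TB} TB`);
console.log(`Song metadata: ${initialPhase.songMetadataBytes / GB} GB`);
console.log(`User metadata: ${initialPhase.userBytes / GB} GB`);
```

Changing the songs and users fields is enough to rerun the same estimate for a larger phase of the service.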
Key Components Explained
1. Data Model – SQL Database Structure
User table:
Song table:
Artist table:
Relationships: The Artist and Song tables are linked through a join table that holds an artistID (a foreign key pointing to the Artist table) and a SongID (a foreign key pointing to the Song table). Starting from an artist, we can follow these keys to the song metadata, which also contains the fileURL property pointing to the location in Blob storage where the song file is stored.
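As a rough illustration of the relationship described above, the sketch below models the tables as TypeScript types and resolves an artist's songs through the join rows. Only artistID, songID, and fileURL come from the text; the remaining fields, the songsByArtist helper, and the example.com URLs are assumptions made for this example, and a real system would run this as a SQL join.

```typescript
// Hypothetical row shapes; only artistID, songID, and fileURL are from the text.
interface Artist {
  artistID: number;
  name: string;
}

interface Song {
  songID: number;
  title: string;
  fileURL: string; // points to the song file in Blob storage
}

// One row of the join table linking an artist to a song.
interface ArtistSong {
  artistID: number;
  songID: number;
}

// In-memory equivalent of joining ArtistSong to Song for one artist.
function songsByArtist(
  artistID: number,
  artistSongs: ArtistSong[],
  songs: Map<number, Song>,
): Song[] {
  return artistSongs
    .filter((link) => link.artistID === artistID)
    .map((link) => songs.get(link.songID))
    .filter((song): song is Song => song !== undefined);
}

// Placeholder data for illustration only.
const artist: Artist = { artistID: 1, name: "Example Artist" };

const songTable = new Map<number, Song>([
  [10, { songID: 10, title: "First Song", fileURL: "https://blobs.example.com/songs/10.mp3" }],
  [11, { songID: 11, title: "Second Song", fileURL: "https://blobs.example.com/songs/11.mp3" }],
  [12, { songID: 12, title: "Other Song", fileURL: "https://blobs.example.com/songs/12.mp3" }],
]);

const artistSongTable: ArtistSong[] = [
  { artistID: 1, songID: 10 },
  { artistID: 1, songID: 11 },
  { artistID: 2, songID: 12 },
];

const artistSongs = songsByArtist(artist.artistID, artistSongTable, songTable);
console.log(artistSongs.map((song) => song.fileURL));
```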
2. Efficient Search Mechanism
Implementing a fast and accurate search feature requires indexing and a robust search algorithm. Elasticsearch is a popular choice for this purpose, given its scalability and speed.
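Search engines such as Elasticsearch are built around an inverted index, which maps each term to the documents that contain it. The sketch below is a toy version of that idea, not Elasticsearch's API: the InvertedIndex class, its tokenizer, and the sample titles are all made up for illustration.

```typescript
type DocId = number;

// Toy inverted index: each token maps to the set of documents containing it.
class InvertedIndex {
  private readonly postings = new Map<string, Set<DocId>>();

  // Lowercase the text and split on anything that is not a letter or digit.
  private static tokenize(text: string): string[] {
    return text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((token) => token.length > 0);
  }

  add(id: DocId, text: string): void {
    for (const token of InvertedIndex.tokenize(text)) {
      let ids = this.postings.get(token);
      if (ids === undefined) {
        ids = new Set<DocId>();
        this.postings.set(token, ids);
      }
      ids.add(id);
    }
  }

  // Returns the ids of documents that contain every token of the query.
  search(query: string): DocId[] {
    const tokens = InvertedIndex.tokenize(query);
    if (tokens.length === 0) {
      return [];
    }
    let result: Set<DocId> | undefined;
    for (const token of tokens) {
      const ids = this.postings.get(token);
      if (ids === undefined) {
        return []; // one token matches nothing, so the whole query does not
      }
      const previous: Set<DocId> | undefined = result;
      result =
        previous === undefined
          ? new Set(ids)
          : new Set([...previous].filter((id) => ids.has(id)));
    }
    return result === undefined ? [] : [...result].sort((a, b) => a - b);
  }
}

const songIndex = new InvertedIndex();
songIndex.add(1, "Midnight Train");
songIndex.add(2, "Midnight City");
songIndex.add(3, "City Lights");
console.log(songIndex.search("midnight city"));
```

A lookup only touches the postings for the query terms instead of scanning every title, which is why indexing matters once the catalogue reaches millions of songs.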
3. Personalized Music Recommendation
Machine learning algorithms analyze user listening habits to provide personalized music recommendations. This involves processing large datasets to identify patterns and preferences.
Putting it all together – Initial Phase
Scaled Phase: Expansion to 500 Million Users
Requirements
Users: 500 million
Songs: 100 million
Artists: 11 million
Scaling to half a billion users and expanding the music library fivefold presents significant challenges. The architecture must not only support increased load but also maintain, if not improve, the quality of service.
Scaling Strategies
Microservices Architecture: Breaking down the application into microservices allows for easier scaling and maintenance.
Advanced Load Balancing: Implementing more sophisticated load balancing techniques to distribute traffic efficiently across servers worldwide.
Global Server Load Balancing (GSLB): Distributes traffic across multiple data centers based on location, improving speed and reliability.
Layer 7 Load Balancing: Makes routing decisions based on the content of HTTP/HTTPS headers, allowing for intelligent traffic distribution.
Content-Aware Load Balancing: Routes requests based on content type or user behavior, optimizing resource use for different types of traffic.
Adaptive Load Balancing: Dynamically adjusts routing based on current server load and network conditions, enhancing performance.
Machine Learning-Driven Load Balancing: Uses predictive analytics to optimize traffic distribution, improving over time as it learns traffic patterns.
Scaling the database with the Leader – Follower technique: Most users only perform read operations, while a comparatively small number of artists perform both reads and writes. We can set up a Leader database that handles both reads and writes, with multiple Follower databases that replicate from it and are dedicated to read-only operations.
Data Sharding and Replication with the Leader – Leader technique: Segmenting the database into smaller, manageable parts (shards) improves performance, and replicating data across multiple leaders that each accept writes helps ensure data availability.
Enhanced CDN Strategies: Utilizing multiple CDNs to reduce latency further and handle the increased traffic.
Sophisticated Machine Learning Models: Implementing more complex algorithms for the recommendation engine to handle the larger dataset and provide more accurate suggestions.
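Two of the database strategies above, Leader – Follower read/write splitting and sharding, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the DbNode type, the ReplicaRouter class, the node names, and the shardIndex helper are hypothetical, and the hash is a simple FNV-1a-style function over character codes rather than a production shard-key strategy.

```typescript
// Hypothetical handle for a database node; a real one would hold a connection.
interface DbNode {
  name: string;
}

// Sends writes to the leader and spreads reads across followers (round-robin).
class ReplicaRouter {
  private nextFollower = 0;

  constructor(
    private readonly leader: DbNode,
    private readonly followers: DbNode[],
  ) {}

  forWrite(): DbNode {
    return this.leader;
  }

  forRead(): DbNode {
    if (this.followers.length === 0) {
      return this.leader; // no followers available, so the leader serves reads
    }
    const node = this.followers[this.nextFollower] ?? this.leader;
    this.nextFollower = (this.nextFollower + 1) % this.followers.length;
    return node;
  }
}

// Maps a key (for example a user id) to a shard number in [0, shardCount).
function shardIndex(key: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  let hash = 2166136261; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 16777619); // FNV-1a 32-bit prime
  }
  return (hash >>> 0) % shardCount;
}

const router = new ReplicaRouter({ name: "leader" }, [
  { name: "follower-1" },
  { name: "follower-2" },
]);

console.log(router.forWrite().name); // uploads by artists go to the leader
console.log(router.forRead().name); // listener reads rotate across followers
console.log(shardIndex("user-42", 8)); // same key always maps to the same shard
```

One design caveat: followers typically apply changes slightly after the leader, so a read that must see a write that was just made may need to be routed to the leader.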
Scaled Phase: Expanded Version Storage Estimation
Song Storage
Assumption: Maintaining the average song size of 3MB.
Calculation: For 100 million songs, the required storage expands to 3MB×100,000,000=300,000GB or 300TB
Song Metadata Storage
Assumption: The metadata size remains at about 100 bytes per song.
Calculation: For 100 million songs, 100 bytes×100,000,000=10GB
User Metadata Storage
Assumption: Keeping the average at 1KB of data per user.
Calculation: For 500 million users, 1KB×500,000,000=500GB
Introduction: Connecting to GitHub or a web server requires a secure method to protect your sensitive data. While password authentication is a common approach, it can be easily compromised if the password is weak or guessable. In this blog post, we’ll explore a more secure way to connect using SSH (Secure Shell).
But if for some reason you choose to stick with passwords, I recommend using a password manager such as 1Password, which helps you create and manage your passwords more securely.
The Importance of Security: When it comes to the security of your GitHub page or web server, it’s crucial to prioritize strong authentication methods. While password-based authentication is relatively convenient, it may not provide sufficient protection against unauthorized access. To enhance security, it’s highly recommended to use SSH.
Understanding Hashing and Encryption: Before we delve into SSH, it’s essential to understand how sensitive data is protected. Algorithms like MD5, SHA-1, and SHA-256 are hash functions rather than encryption algorithms: they turn input data into a fixed-length digest that is not meant to be reversed. Because these algorithms are public and fast, an attacker can hash lists of common passwords and compare the results, and MD5 and SHA-1 are no longer considered secure. To strengthen stored hashes, it is advisable to combine the hash function with a randomly generated salt. The salt makes identical inputs produce different digests, so precomputed lookup tables stop working and guessing the original input becomes significantly harder.
Generating SSH Keys: To establish an SSH connection, you need to generate SSH keys. Follow these steps:
Open your terminal and navigate to the SSH folder by running the following command: cd ~/.ssh
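Generate the public and private keys using the command: ssh-keygen
When prompted for a file name, enter the name used in the following steps (kirandash_github in this post). The command generates the private key file and the matching public key file (.pub).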
Configuring SSH: To instruct SSH to use your private key for every connection attempt, you need to modify the SSH config file. Here’s how:
Open the SSH config file using a text editor: vi ~/.ssh/config
Add the following content to the config file:
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/kirandash_github
Adding Key to Apple Keychain: To streamline the SSH authentication process, you can add the private key to Apple Keychain. Follow these steps:
Run the command: ssh-add -K kirandash_github
This command adds the private key to the Apple Keychain. On recent macOS versions the -K flag is deprecated in favor of --apple-use-keychain.
Configuring GitHub
To establish a connection with your GitHub account, you need to add the public key. Here’s how:
Display the contents of the public key by running the command: cat kirandash_github.pub
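Copy the output and add it to the list of authorized keys: on GitHub under Settings > SSH and GPG keys, or on a web server in the ~/.ssh/authorized_keys file.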
Conclusion: By following these steps, you can connect to GitHub or a web server securely using SSH. SSH provides a stronger authentication mechanism compared to password-based authentication, enhancing the overall security of your system. Implementing these measures will help protect your data and ensure a safer connection to your remote servers.
Remember, prioritizing security is crucial in today’s digital landscape, and SSH is an essential tool in achieving that goal.
My team members and viewers of my YouTube channel often ask me which software and tools I use. This post lists almost every tool I use as a developer.
Tech
React – The most widely used frontend framework in the world. I previously used Angular but switched to React in 2018.
TypeScript – It has helped me avoid tons of bugs in my JavaScript projects.
Testing Library – A great testing library for anything that interacts with the DOM. If you are still using Enzyme, it’s time to switch.
In this tutorial we are going to build a COVID-19 (coronavirus disease) tracker, a modern, scalable React application, using React JS and four advanced React tools: Redux, Thunks, Selectors, and Styled Components. React JS alone is enough for creating simple applications, but if you want to build large, high-performance applications, your job will be much simpler if you know how to use these additional tools.
For the application concept, I wanted to create something that might be useful to me in the current situation. The current situation in the world is not good: COVID-19 has affected the lives of people in many countries. I am currently staying in Singapore and have been working from home for the last two weeks. As a developer, I thought it would be a good use of my weekend to create a tutorial on building an app for tracking coronavirus reports from different countries.
Search Country Report: Country Codes you can use for testing: GB (United Kingdom), US (USA), SG (Singapore), GE (Georgia), IN (India), IT (Italy), ES (Spain).
Pin Country Result to move a result to the very top.
Remove a Country from the list.
Persist the results across reloads. Note: The results are not saved in any database. We will save them in localStorage, since our only focus here is frontend development with React.
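The persistence feature in the list above boils down to serialising the results to a string on save and parsing them back on load. Here is a minimal sketch of that round trip. The CountryReport shape, the STORAGE_KEY value, and the helper names are assumptions for this example rather than the tutorial's actual code, and the store is passed in so the same functions work with window.localStorage in the browser or with an in-memory stand-in elsewhere.

```typescript
// The subset of the Web Storage API that the helpers need.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical shape of one saved result.
interface CountryReport {
  code: string;
  pinned: boolean;
}

const STORAGE_KEY = "countries"; // assumed key name

function saveCountries(store: KeyValueStore, countries: CountryReport[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(countries));
}

function loadCountries(store: KeyValueStore): CountryReport[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) {
    return []; // nothing saved yet
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as CountryReport[]) : [];
  } catch {
    return []; // ignore corrupted data instead of crashing on page load
  }
}

// In-memory stand-in for localStorage, useful outside the browser.
class MemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();

  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const store = new MemoryStore();
saveCountries(store, [
  { code: "SG", pinned: true },
  { code: "IN", pinned: false },
]);
console.log(loadCountries(store));
```

In the browser, the same calls would be saveCountries(window.localStorage, countries) and loadCountries(window.localStorage).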
Prerequisites for this Tutorial:
To get the most out of this tutorial, it helps if you already know the following:
Introduction and Project Setup: We will set up the project and understand the project structure.
Build the Application view by creating components in React JS. Important Sections:
Manage state of the application with Redux.
Handle API/Asynchronous calls with Thunks.
Selectors: A middle layer between API layer and Component View.
Styled Components: For handling CSS in a smart way, from the JS file instead of creating a separate CSS file.
Build app for Production deployment.
Why use the Advanced React Tools?
Every application mainly consists of three important layers: an API layer to get data from the APIs, a data layer where we handle the API data and shape it to our requirements, and, last but not least, a view layer to show the data.
React JS was designed mainly to take care of the views. React is powerful at showing data, but for other tasks such as calling APIs and managing data it is less well suited. React can do the job, but it does not prescribe any standard for managing state or performing API calls. That is fine for a small application, but on a large application with a team, each developer will handle the code in their own way. Without a shared set of rules the code becomes cluttered and extremely difficult to debug later. That’s where these tools help: they provide extra rules on how to do things. For example, Redux takes care of data or state management by adding some standard rules. Similarly, Thunks offer a standard way of calling APIs, and Styled Components offer a specific way of handling CSS.
In summary, these extra tools help us organise the application in a more standard way by separating responsibilities among different tools instead of handling everything with React. As a result, the application is easier to manage and expand.
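To show how these responsibilities separate in practice, here is a library-free sketch of the three patterns: a reducer for state changes, a thunk for the asynchronous API call, and selectors for reading data for the view. The action names, the Country shape, and the injected fetchCountry function are assumptions for this example, not the tutorial's actual code. The real Redux and redux-thunk libraries provide the store and middleware that this sketch leaves out.

```typescript
interface Country {
  code: string;
  cases: number;
  pinned: boolean;
}

interface State {
  countries: Country[];
  isLoading: boolean;
}

type Action =
  | { type: "LOAD_STARTED" }
  | { type: "LOAD_SUCCEEDED"; country: Country }
  | { type: "COUNTRY_PINNED"; code: string }
  | { type: "COUNTRY_REMOVED"; code: string };

const initialState: State = { countries: [], isLoading: false };

// Reducer: the only place where state changes, always returning a new object.
function reducer(state: State = initialState, action: Action): State {
  switch (action.type) {
    case "LOAD_STARTED":
      return { ...state, isLoading: true };
    case "LOAD_SUCCEEDED":
      return {
        isLoading: false,
        countries: [
          ...state.countries.filter((c) => c.code !== action.country.code),
          action.country,
        ],
      };
    case "COUNTRY_PINNED":
      return {
        ...state,
        countries: state.countries.map((c) =>
          c.code === action.code ? { ...c, pinned: true } : c,
        ),
      };
    case "COUNTRY_REMOVED":
      return {
        ...state,
        countries: state.countries.filter((c) => c.code !== action.code),
      };
    default:
      return state;
  }
}

type Dispatch = (action: Action) => void;
type FetchCountry = (code: string) => Promise<Country>;

// Thunk: wraps the asynchronous API call and reports progress through actions.
const loadCountry =
  (code: string, fetchCountry: FetchCountry) =>
  async (dispatch: Dispatch): Promise<void> => {
    dispatch({ type: "LOAD_STARTED" });
    const country = await fetchCountry(code);
    dispatch({ type: "LOAD_SUCCEEDED", country });
  };

// Selectors: the view reads data through these instead of touching state shape.
const selectPinned = (state: State): Country[] =>
  state.countries.filter((c) => c.pinned);
const selectUnpinned = (state: State): Country[] =>
  state.countries.filter((c) => !c.pinned);
```

Because each piece is a plain function, the reducer and selectors can be tested without rendering any React component.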
Tasks for you after Tutorial:
Create an Unpin Country button that moves a pinned country back to the Not Pinned Countries section when clicked.
Create another React component to show the global stats of total cases, from the API: https://api.thevirustracker.com/free-api?global=stats. Use the same flow: first create a GlobalStats.js component, then add a redux globalstats reducer, add selectors, and finally add styles with styled components.
Modify the reducer code to remove the isLoading reducer and make isLoading a property of state.countries instead of a direct child of the state. Since we are adding a new API, each API needs its own isLoading property, so state.globalstats needs an isLoading property as well.