Video summary
Lec-12: Storage in Cloud 🌧️| Microsoft Azure Storage 🗃️Services with Real life examples
Main summary
Key takeaways
Main ideas / concepts covered
-
Cloud storage purpose: Cloud storage is designed to handle very large volumes of data and provide:
- Proper management
- Performance
- Scalability
- Fast backup/recovery in case of disasters
-
Cloud storage strategy (data is not stored “directly” without planning):
- When data arrives, you first determine what type of data it is.
- Azure (as an example) provides different storage services tailored to different data types and usage patterns.
-
Azure storage service categories (as described):
- (1) Object (blob) storage: Used for unstructured data such as images, videos, and other binary data, with several service variants for different needs.
- (2) File storage: Used for text files and general files that need quick sharing/access.
- Example: A multinational company where employees in different locations share files remotely via a repository.
- (3) Queue storage: Used for messages/requests that need asynchronous processing, so that a burst of requests can be buffered and handled later instead of being lost or delayed.
- Example: After a concert, a huge number of requests/messages arrive; queue storage stores them so processing can be done efficiently.
- (4) Table storage: Used for schema-less / non-relational data, described as key-value pairs.
- Example: IoT sensors (temperature/humidity) continuously generate data, stored using unique keys and corresponding values.
- (5) Disk storage: Described in the lecture as structured storage for sequential data; in Azure, disks are block-level volumes attached to virtual machines.
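The queue-storage idea in the list above (buffer a burst of requests, then process them asynchronously) can be sketched with Python's standard library. This is an in-process stand-in, not the Azure Queue Storage API; `process_requests` and the `handled:` prefix are hypothetical names chosen for illustration.

```python
import queue
import threading


def process_requests(requests, worker_count=2):
    """Buffer all requests in a queue, then let workers drain it concurrently."""
    pending = queue.Queue()
    for request in requests:
        pending.put(request)  # producers only enqueue; they do not wait for processing

    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to process
            with lock:
                processed.append(f"handled:{item}")
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


if __name__ == "__main__":
    # Order of completion may vary between runs because workers run concurrently.
    print(sorted(process_requests(["ticket-1", "ticket-2", "ticket-3"])))
```

The point of the design is that senders finish as soon as a message is enqueued, while the number of workers can be tuned independently of the arrival rate.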
-
Key principle: redundancy is essential
- Data is not kept in only one place.
- Azure architecture described as:
- Region → multiple zones → multiple data centers
- Policy mentioned: at least three copies within a region (and also copies across regions).
- Rationale: ensures availability even if:
- a data center fails,
- a zone fails,
- or an entire region faces a disaster.
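The failure scenarios above can be checked with a small model. This is a minimal sketch, assuming a placement of three copies in one region plus one copy in a second region; the `Replica` class, the `is_available` function, and all region/zone/data-center names are hypothetical and do not correspond to real Azure identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    region: str
    zone: str
    datacenter: str


def is_available(replicas, failed_regions=(), failed_zones=(), failed_datacenters=()):
    """Return True if at least one replica is outside every failed location."""
    for r in replicas:
        if r.region in failed_regions:
            continue
        if (r.region, r.zone) in failed_zones:
            continue
        if (r.region, r.zone, r.datacenter) in failed_datacenters:
            continue
        return True
    return False


# Assumed placement: three copies spread over zones of one region, one copy elsewhere.
REPLICAS = [
    Replica("region-a", "zone-1", "dc-1"),
    Replica("region-a", "zone-2", "dc-1"),
    Replica("region-a", "zone-3", "dc-1"),
    Replica("region-b", "zone-1", "dc-1"),
]

if __name__ == "__main__":
    print(is_available(REPLICAS, failed_datacenters={("region-a", "zone-1", "dc-1")}))
    print(is_available(REPLICAS, failed_zones={("region-a", "zone-1")}))
    print(is_available(REPLICAS, failed_regions={"region-a"}))
```

Only the loss of every location that holds a copy makes the data unavailable, which is why copies are spread across data centers, zones, and regions rather than kept together.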
-
Availability / “nines”
- The lecture emphasizes extremely high availability (e.g., 99.999%, or “five nines”).
- More nines mean less permitted downtime, and therefore a more reliable service.
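To make the “nines” concrete, an availability percentage can be converted into the downtime it permits per year. The sketch below is plain arithmetic, assuming a 365-day year; the function name `downtime_minutes_per_year` is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent):
    """Convert an availability percentage into allowed downtime in minutes per year."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        print(f"{pct}% -> {downtime_minutes_per_year(pct):.2f} minutes/year")
```

Each additional nine divides the permitted downtime by ten: 99.9% allows roughly 525.6 minutes per year, while 99.999% allows only about 5.26 minutes.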
-
Tier-based storage (hot → cold → archive)
- Not all data has the same importance/frequency.
- Data is labeled by priority (examples: highest / VVIP / VIP / common).
- Caching stores frequently requested data near where it’s needed for faster access.
- Hot tier: accessed very frequently (e.g., most recent/high-demand data).
- Cold tier: accessed less often (e.g., every 2–4 days or around 10 days).
- Archive tier: rarely accessed (e.g., not used for a year).
- Automatic movement described:
- An email from a week/month ago moves from hot → cold.
- If not accessed for ~6 months to a year, it moves to archive.
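The automatic movement described above can be expressed as a rule on how long an item has gone without being accessed. This is a minimal sketch, not a real lifecycle policy: the 7-day and 180-day thresholds are assumptions picked from the lecture's rough ranges (“a week/month” and “~6 months to a year”), and `choose_tier` is a hypothetical name.

```python
from datetime import date, timedelta

# Assumed thresholds; the lecture gives only approximate ranges.
HOT_MAX_IDLE = timedelta(days=7)
COLD_MAX_IDLE = timedelta(days=180)


def choose_tier(last_accessed, today):
    """Pick a storage tier from the time elapsed since the last access."""
    idle = today - last_accessed
    if idle <= HOT_MAX_IDLE:
        return "hot"
    if idle <= COLD_MAX_IDLE:
        return "cold"
    return "archive"


if __name__ == "__main__":
    today = date(2024, 6, 30)
    for last in (date(2024, 6, 29), date(2024, 5, 1), date(2023, 1, 1)):
        print(last, "->", choose_tier(last, today))
```

Running such a rule periodically over all stored items would move an old email from hot to cold and, much later, to archive, without anyone deciding item by item.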
-
Storage media cost optimization
- Higher-demand data uses faster/expensive storage (e.g., SSD).
- Lower-demand data uses cheaper/denser media (e.g., HDD).
- Archived data can use tape drives / magnetic tapes to reduce cost.
-
Generalization beyond Azure
- The same conceptual approach applies to other cloud providers (AWS referenced): data-type-based storage selection, redundancy, and tiering remain core ideas.
Methodology / instruction-like steps (detailed)
-
On incoming data:
- Identify data type/variety (e.g., images/videos/binary, files/text, messages/requests, key-value style data, structured sequential data).
-
Choose the appropriate storage service based on data type:
- Use the corresponding category:
- object-style for media/binary,
- file storage for shared files,
- queue storage for async messages,
- table storage for schema-less key-value data,
- disk storage for structured/sequential data.
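The selection step above amounts to a lookup from data kind to storage category. The sketch below assumes a hypothetical set of data-kind labels (`"image"`, `"shared_file"`, and so on); only the five category names come from the lecture, and `choose_service` is an illustrative name rather than any provider's API.

```python
# Hypothetical data-kind labels mapped to the five categories named in the lecture.
SERVICE_BY_DATA_KIND = {
    "image": "object",
    "video": "object",
    "binary": "object",
    "shared_file": "file",
    "message": "queue",
    "key_value": "table",
    "sequential": "disk",
}


def choose_service(data_kind):
    """Return the storage category for a data kind, or raise if it is unknown."""
    try:
        return SERVICE_BY_DATA_KIND[data_kind]
    except KeyError:
        raise ValueError(f"no storage category defined for {data_kind!r}") from None


if __name__ == "__main__":
    for kind in ("video", "message", "key_value"):
        print(kind, "->", choose_service(kind))
```

Raising an error for unknown kinds reflects the lecture's point that data should be classified before it is stored, not placed somewhere by default.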
-
Ensure redundancy by design:
- Store multiple copies:
- At least three copies within the same region,
- plus additional copies across zones and other geographic regions (intra- and inter-region redundancy).
-
Apply tiering based on access frequency:
- Label data by priority/importance.
- Keep frequently accessed data in hot tier.
- Move moderately accessed data to cold tier.
- Move rarely accessed data to archive tier.
-
Use caching to speed access for high-frequency data:
- Cache data on the nearest server to reduce response time.
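A cache of this kind can be sketched as a small in-memory store placed in front of a slower origin. The lecture does not name an eviction policy; least-recently-used (LRU) eviction is assumed here for illustration, and `LRUCache` and its `fetch` callback are hypothetical names.

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently used items; fetch from the origin on a miss."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self._fetch = fetch          # function that loads a value from slower storage
        self._items = OrderedDict()  # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item
        return value


if __name__ == "__main__":
    cache = LRUCache(capacity=2, fetch=lambda key: f"content of {key}")
    for key in ("a", "b", "a", "c", "b"):
        cache.get(key)
    print("hits:", cache.hits, "misses:", cache.misses)
```

Frequently requested items stay in the cache and are served without touching the origin, which is the response-time benefit the lecture attributes to caching.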
-
Match storage media to demand to control cost:
- Use SSD for hot/high-demand data,
- hard disks for medium-demand data,
- tape drives/magnetic tapes for archive/low-demand data.
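The cost argument above can be illustrated with a tier-to-media mapping and a toy cost model. The relative cost units below are placeholders invented for this sketch, not real prices; only the ordering (SSD more expensive than HDD, HDD more expensive than tape) reflects the lecture. `monthly_cost` is a hypothetical helper.

```python
MEDIA_BY_TIER = {"hot": "ssd", "cold": "hdd", "archive": "tape"}

# Placeholder relative cost per GB per month; NOT real prices.
RELATIVE_COST_PER_GB = {"ssd": 10, "hdd": 3, "tape": 1}


def monthly_cost(gb_by_tier):
    """Sum the relative monthly cost of the data held in each tier."""
    return sum(
        gb * RELATIVE_COST_PER_GB[MEDIA_BY_TIER[tier]]
        for tier, gb in gb_by_tier.items()
    )


if __name__ == "__main__":
    tiered = {"hot": 10, "cold": 100, "archive": 1000}
    all_hot = {"hot": sum(tiered.values())}
    print("tiered:", monthly_cost(tiered))
    print("everything on SSD:", monthly_cost(all_hot))
```

Under these assumed numbers, keeping all data on the fastest medium costs several times more than tiering it, which is the motivation for matching media to demand.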
-
Aim for high availability (“nines”):
- Maintain infrastructure and redundancy so the service remains available even during disasters.
Speakers / sources featured
- Speaker: The lecture narrator / instructor (e.g., “Dear students…”, “I am going to explain…”, “from Azure’s point of view…”)
- Source/technology mentioned: Microsoft Azure (primary example) and AWS (mentioned as an analogy/generalization)
- No other specific named individuals or organizations were featured.