Summary of "Lec-12: Storage in Cloud 🌧️| Microsoft Azure Storage 🗃️Services with Real life examples"
Main ideas / concepts covered
-
Cloud storage purpose: Cloud storage is designed to handle very large volumes of data and provide:
- Proper management
- Performance
- Scalability
- Fast backup/recovery in case of disasters
-
Cloud storage strategy (data is not stored “directly” without planning):
- When data arrives, you first determine what type of data it is.
- Azure (as an example) provides different storage services tailored to different data types and usage patterns.
-
Azure storage service categories (as described):
- (1) Object storage (Blob Storage in Azure): Used for unstructured data such as images, videos, and other binary data (the lecture mentions that the service comes in different “varieties”).
- (2) File storage: Used for text files and general files that need quick sharing/access.
- Example: A multinational company where employees in different locations share files remotely via a repository.
- (3) Queue storage: Used for messaging/requests that need asynchronous processing (to handle time delays).
- Example: After a concert, a huge number of requests/messages arrive; queue storage stores them so processing can be done efficiently.
- (4) Table storage: Used for schema-less / non-relational data, described as key-value pairs.
- Example: IoT sensors (temperature/humidity) continuously generate data, stored using unique keys and corresponding values.
- (5) Disk storage: Used for structured, sequential data; the lecture calls this “structured disk storage”.
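The queue storage idea in item (3) can be sketched in a few lines. This is only an in-process stand-in built on Python's standard `queue` and `threading` modules, not the Azure Queue Storage API; the function name `process_requests` and the `worker_count` parameter are hypothetical. It shows a burst of requests being buffered first and then drained by a small pool of workers.

```python
import queue
import threading


def process_requests(requests, worker_count=3):
    """Buffer a burst of requests, then let workers drain them asynchronously.

    Hypothetical helper: an in-memory queue stands in for a cloud queue service.
    """
    pending = queue.Queue()
    for request in requests:
        pending.put(request)  # producers only enqueue; they do not wait for processing

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                request = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to process
            with results_lock:
                results.append(f"processed:{request}")
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because several workers run concurrently, the order of the results is not guaranteed; only the set of processed requests is.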
-
Key principle: redundancy is essential
- Data is not kept in only one place.
- Azure architecture described as:
- Region → multiple zones → multiple data centers
- Policy mentioned: at least three copies within a region (and also copies across regions).
- Rationale: ensures availability even if:
- a data center fails,
- a zone fails,
- or an entire region faces a disaster.
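The redundancy layout above can be illustrated with a small model. This is a sketch under stated assumptions: the `Location` class, the `place_replicas` and `survives` functions, and names such as "region-a" are all hypothetical, and the placement rule (one copy per zone in the primary region plus one copy in a second region) is a simplification of what the lecture describes, not Azure's actual replication logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    region: str
    zone: str
    data_center: str


def place_replicas(primary_region, secondary_region, zones_per_region=3):
    """Place one copy per zone in the primary region, plus one copy in another region."""
    replicas = [
        Location(primary_region, f"zone-{i}", f"dc-{i}")
        for i in range(1, zones_per_region + 1)
    ]
    replicas.append(Location(secondary_region, "zone-1", "dc-1"))
    return replicas


def survives(replicas, failed_region=None, failed_zone=None):
    """Return True if at least one copy remains after a zone or region failure.

    With only failed_region set, the whole region is lost; with failed_zone
    also set, only that zone of the region is lost.
    """
    remaining = [
        r for r in replicas
        if not (r.region == failed_region
                and (failed_zone is None or r.zone == failed_zone))
    ]
    return len(remaining) > 0
```

For example, `survives(place_replicas("region-a", "region-b"), failed_region="region-a")` is `True` because the copy in the second region is still reachable.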
-
Availability / “nines”
- The lecture emphasizes extremely high availability, expressed in “nines” (e.g., 99.999%, i.e., five nines).
- More nines mean less permitted downtime, so the service is more reliable.
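To make the “nines” concrete, availability is usually written as a percentage such as 99.9% (three nines) or 99.999% (five nines), and each extra nine cuts the permitted downtime by a factor of ten. The function name below is hypothetical, and the calculation assumes a 365-day year.

```python
def allowed_downtime_minutes_per_year(nines: int) -> float:
    """Maximum downtime per year for an availability with the given number of nines.

    Example: nines=3 means 99.9% availability, so 0.1% of the year may be downtime.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes, ignoring leap years
    unavailable_fraction = 10 ** (-nines)
    return minutes_per_year * unavailable_fraction
```

Under these assumptions, three nines allow roughly 525.6 minutes (almost 9 hours) of downtime per year, while five nines allow only about 5.26 minutes.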
-
Tier-based storage (hot → cold → archive)
- Not all data has the same importance/frequency.
- Data is labeled by priority (examples: highest / VVIP / VIP / common).
- Caching stores frequently requested data near where it’s needed for faster access.
- Hot tier: accessed very frequently (e.g., most recent/high-demand data).
- Cold tier: accessed less often (e.g., once every 2–4 days, or about every 10 days).
- Archive tier: rarely accessed (e.g., not used for a year).
- Automatic movement described:
- An email from a week/month ago moves from hot → cold.
- If not accessed for ~6 months to a year, it moves to archive.
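The automatic movement between tiers can be sketched as a rule on how long an item has gone unaccessed. The thresholds below (7 days and 180 days) are illustrative assumptions loosely based on the lecture's “a week” and “about 6 months” examples, and the function name `choose_tier` is hypothetical; real providers let you configure such rules yourself.

```python
from datetime import date

# Illustrative thresholds, not provider defaults.
COLD_AFTER_DAYS = 7
ARCHIVE_AFTER_DAYS = 180


def choose_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier from the number of days since the last access."""
    idle_days = (today - last_accessed).days
    if idle_days >= ARCHIVE_AFTER_DAYS:
        return "archive"
    if idle_days >= COLD_AFTER_DAYS:
        return "cold"
    return "hot"
```

An email last opened a month ago would land in the cold tier under these thresholds, and one untouched for a year would land in archive.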
-
Storage media cost optimization
- Higher-demand data uses faster/expensive storage (e.g., SSD).
- Lower-demand data uses cheaper/denser media (e.g., HDD).
- Archived data can use tape drives / magnetic tapes to reduce cost.
-
Generalization beyond Azure
- The same conceptual approach applies to other cloud providers (AWS referenced): data-type-based storage selection, redundancy, and tiering remain core ideas.
Methodology / instruction-like steps (detailed)
-
On incoming data:
- Identify data type/variety (e.g., images/videos/binary, files/text, messages/requests, key-value style data, structured sequential data).
-
Choose the appropriate storage service based on data type:
- Use the corresponding category:
- object-style for media/binary,
- file storage for shared files,
- queue storage for async messages,
- table storage for schema-less key-value data,
- disk storage for structured/sequential data.
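The selection step can be written as a simple lookup. The data-type labels and the `choose_storage_service` function are hypothetical names chosen for this sketch; the categories themselves follow the five listed in the lecture.

```python
# Hypothetical data-type labels mapped to the lecture's five storage categories.
SERVICE_BY_DATA_TYPE = {
    "image": "object",
    "video": "object",
    "binary": "object",
    "shared_file": "file",
    "message": "queue",
    "key_value": "table",
    "structured_sequential": "disk",
}


def choose_storage_service(data_type: str) -> str:
    """Return the storage category for a data type, or fail loudly if it is unknown."""
    try:
        return SERVICE_BY_DATA_TYPE[data_type]
    except KeyError:
        raise ValueError(f"no storage category defined for data type: {data_type!r}")
```

Raising an error for unknown types, rather than falling back to a default, reflects the point that data should not be stored without first deciding what kind of data it is.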
-
Ensure redundancy by design:
- Store multiple copies:
- At least three copies within the same region,
- plus additional copies across zones and other geographic regions (intra- and inter-region redundancy).
-
Apply tiering based on access frequency:
- Label data by priority/importance.
- Keep frequently accessed data in hot tier.
- Move moderately accessed data to cold tier.
- Move rarely accessed data to archive tier.
-
Use caching to speed access for high-frequency data:
- Cache data on the nearest server to reduce response time.
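A minimal sketch of such a cache follows. The lecture does not say how a cache decides what to keep, so the least-recently-used (LRU) eviction policy here is an assumption, as are the `LRUCache` class and the `fetch_from_origin` callback, which stands in for a slower request to the main storage.

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently requested items close by; evict the least recently used."""

    def __init__(self, capacity, fetch_from_origin):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._fetch = fetch_from_origin  # hypothetical callback to the slower origin store
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._fetch(key)  # slow path: go back to the origin
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
        return value
```

Repeated requests for the same key are served from the cache and counted in `hits`, while first-time or evicted keys go back to the origin and are counted in `misses`.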
-
Match storage media to demand to control cost:
- Use SSD for hot/high-demand data,
- hard disks for medium-demand data,
- tape drives/magnetic tapes for archive/low-demand data.
-
Aim for high availability (“nines”):
- Maintain infrastructure and redundancy so the service remains available even during disasters.
Speakers / sources featured
- Speaker: The lecture narrator / instructor (e.g., “Dear students…”, “I am going to explain…”, “from Azure’s point of view…”)
- Source/technology mentioned: Microsoft Azure (primary example) and AWS (mentioned as an analogy/generalization)
- No other specific named individuals or organizations were featured.
Category
Educational