⛃ Where to Start in Your Database Career (Notes from the Field)

Trying not to be the grumpy old man, just giving some sage and helpful advice

What’s in today’s newsletter:

  • Dremio launches hybrid data catalog for Apache Iceberg 🚀

  • Snowflake enhances data security with AI features 🔒

  • AWS launches Apache Iceberg for improved data management 🌐

  • Matillion recognized as Challenger in Gartner report! 🚀

DATA STORAGE & ENGINES

TL;DR: Dremio launched the first hybrid data catalog for Apache Iceberg, enhancing dynamic data management, governance, and compliance across hybrid and multi-cloud environments, driving business agility and data quality.

  • Dremio has launched the first hybrid data catalog specifically designed for Apache Iceberg, enhancing data management.

  • This catalog enables organizations to manage data across various formats and locations seamlessly, promoting agility.

  • Advanced features include automated data lineage and an intuitive interface, simplifying the data management process.

  • The catalog significantly improves data governance, compliance, and security for organizations leveraging hybrid and multi-cloud environments.

Why this matters: As businesses navigate complex data landscapes, Dremio's hybrid catalog for Apache Iceberg simplifies cross-environment data management, enhancing compliance and security. This tool supports better business agility and informed decision-making, offering a competitive edge in industries increasingly reliant on sophisticated data solutions. 

SNOWFLAKE

TL;DR: Snowflake has launched AI-powered security features to enhance data protection, including real-time anomaly detection and automated incident response, significantly improving compliance and customer trust in cloud environments.

  • Snowflake has introduced AI-driven security features to enhance the protection of sensitive data on its platform.

  • The "Snowflake's Data Protection" feature monitors data access patterns and detects anomalies in real-time.

  • Automated workflows for remediation utilize AI to streamline incident response processes and improve recovery times.

  • These enhancements bolster data security, aid compliance, and support customer trust within the tech industry.

Why this matters: Snowflake's AI-driven security features signal a crucial evolution in data protection. As data becomes central to business strategy, integrating AI strengthens security, aids compliance, and maintains customer trust. This not only reduces breach risks but also ensures regulatory alignment, fostering enhanced growth and innovation in the tech industry.

AWS


TL;DR: AWS launched a hosted Apache Iceberg service on S3 to improve data management, simplify data lakes, enhance querying, and reduce costs for enterprises focused on big data analytics.

  • AWS has launched a hosted Apache Iceberg service to enhance data management on its S3 storage solution.

  • The new service simplifies data lake management and includes a metadata management layer for efficient querying.

  • This launch aims to reduce operational costs and boost productivity for enterprises reliant on big data analytics.

  • By adopting Apache Iceberg, AWS strengthens its competitiveness against other cloud platforms focused on data analytics.

Why this matters: The hosted Apache Iceberg service on AWS S3 empowers organizations to efficiently manage expansive datasets, enhancing productivity and reducing costs. By offering streamlined data retrieval and insights, AWS not only strengthens its position against other cloud platforms but also supports enterprises in optimizing big data utilization. 

DATA ENGINEERING

TL;DR: Matillion has been named a Challenger in Gartner's Magic Quadrant for Data Integration Tools, highlighting its innovative cloud integration solutions and potential for attracting clients and investors in data management.

  • Matillion has been recognized as a Challenger in the Gartner Magic Quadrant for Data Integration Tools.

  • The company's focus on cloud data warehouse integration has significantly contributed to its growth and influence.

  • Matillion continues to enhance its product features to meet evolving customer needs in data management.

  • This recognition may enhance Matillion's appeal to potential clients and investors in the competitive market.

Why this matters: Matillion's recognition as a Challenger in the Gartner Magic Quadrant validates its impact in the cloud data integration domain. This boosts its credibility, enabling it to attract clients and investors, and positions it as a key player to help businesses make data-driven decisions, crucial in today’s competitive technological landscape.

DEEP DIVE

Where I would start in 2024

This week, please be advised that I scrapped my previously planned mutterings about Apache Polaris to talk about something I think is a bit more pointed and meaningful than the 10,000 different technologies I occupy my time with.

Hard skills to have

For anyone who wants to start a data career: from what I have seen in the field, here are the top technologies and skills I would hone in 2024 and beyond if I were starting a database career:

  • Python (especially Pandas)

  • SQL

  • Any one of SQL Server, MySQL, or PostgreSQL (I still don’t know how to pronounce it)

  • An understanding of Data Lakes

  • A fundamental course on the basics of cloud computing

  • Prompt engineering and some AI and ML skills
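To make the first two items a little more concrete, here is a minimal sketch of Python and SQL working together. It uses only the sqlite3 module that ships with Python; the table name, columns, and rows are all hypothetical and made up purely for illustration, not taken from any real system.

```python
import sqlite3

# Hypothetical table and rows, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_loads (load_date TEXT, source TEXT, row_count INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_loads VALUES (?, ?, ?)",
    [
        ("2024-06-01", "orders", 1200),
        ("2024-06-01", "payments", 1180),
        ("2024-06-02", "orders", 1315),
        ("2024-06-02", "payments", 0),  # a load that brought in nothing
    ],
)

# SQL does the set-based work: find loads that delivered zero rows.
EMPTY_LOADS_SQL = """
    SELECT load_date, source
    FROM daily_loads
    WHERE row_count = 0
    ORDER BY load_date, source
"""


def find_empty_loads(connection):
    """Return (load_date, source) tuples for loads with zero rows."""
    return connection.execute(EMPTY_LOADS_SQL).fetchall()


# Python does the glue work: run the query and report what it found.
empty_loads = find_empty_loads(conn)
for load_date, source in empty_loads:
    print(f"Empty load: {source} on {load_date}")
```

If you have Pandas installed, pandas.read_sql_query(EMPTY_LOADS_SQL, conn) should hand back the same rows as a DataFrame, which is usually where the Pandas part of the skill list comes in.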

Soft skills to have

The first soft skill you need is, of course, critical thinking. If you are fresh out of a boot-camp or university, you may not have encountered those gritty situations where a key system is down, you are doing your utmost to solve the problem, and an annoying co-worker or manager is getting excited and peppering you with questions. Critical thinking is key. Keep your emotions out of solving challenging, technical problems.

You need to be able to communicate effectively as well. Resources abound about effective communication. I talk about databases here, and not about talking and writing.

The third soft skill to have is your mindset. Manipulating mountains of data can seem daunting. You basically have to develop the confidence to push through the problems you will encounter.

Conclusion

You might be saying, “Is this really the Deep Dive section?” I’d say, “Yes, it is.” This is just a distillation of things like:

  • Dealing with a blackout at 3 in the morning with the company owner telling you, “Call the data center. See if our stuff is working.”

  • Missing data loads that cost the company $600,000 USD for the month

  • The rare, odd co-worker that can’t handle your capabilities

  • The challenge of staying relevant and current with your skills

  • Dealing with corporate shenanigans (I would love to get into those details, but this is not the forum; I’ll expound here in the near future, maybe)

  • Telling a pretty senior and inquisitive VP, “Don't worry J****, I’m a professional” (obviously I can’t say his name), when my team and I had to re-build a pesky customer-facing SQL Server instance from scratch on Super Bowl Sunday (the Eagles won, BTW, 41-33)

Just keep all this stuff in mind if you are just starting out in the database field. It can be a very rewarding career. I would just stress: augment your data skills with a bit of AI and ML going forward, and develop your soft skills.

To your success.

Gladstone