Channel: aviyehuda.com
Browsing latest articles
Browse All 18 View Live


JavaScript encapsulation & the module pattern

Encapsulation is one of the key features of object-oriented programming languages. In languages like Java, it is a very straightforward concept to implement. Since I know JavaScript is considered an OO...





jQuery Deferred – one step closer to desktop apps

Every time I forget why I like jQuery, they remind me. Not too long ago I came across jQuery Deferred (even though it was added back in jQuery 1.5) and I immediately liked it. I feel this...



Best code convention syndrome

Developers often tend to think that one coding convention is better than another in terms of readability. Some people think that adding a break before the curly braces is more coherent. Some like...



How to properly collect AWS EMR metrics?

Working with AWS EMR has a lot of benefits. But when it comes to metrics, AWS currently does not supply a proper solution for collecting cluster metrics from EMR. Well, there is AWS CloudWatch of...



The right way to use Spark and JDBC

A while ago I had to read data from a MySQL table, manipulate it a bit, and store the results on disk. The obvious choice was to use Spark; I was already using it for other stuff...




Quick tip: Easily find data on the data lake when using AWS Glue Catalog

Finding data on the data lake can sometimes be a challenge. At my current workplace (ZipRecruiter) we have hundreds of tables on the data lake, and the number is growing each day. We store the data on AWS S3...



Coalesce with care…Coalesce Vs. Repartition in SparkSQL

Here is a quick Spark SQL riddle for you: what do you think could be problematic in the following Spark code (assume that the Spark session was configured in an ideal way)? sparkSession.sql("select * from...



Spark and Small Files

In my previous post I showed this short code example: sparkSession.sql("select * from my_website_visits where post_id=317456").write.parquet("s3://reports/visits_report") And I asked what may be...




Parquet data filtering with Pandas

When it comes to filtering data from Parquet files using pandas, several strategies can be employed. While it’s widely recognized that partitioning data can significantly enhance the efficiency of...




Data Engineering: Strategies for data retrieval on multi-dimensional data

You’ve likely heard about the benefits of partitioning data by a single dimension to boost retrieval performance. It’s a common practice in relational databases, NoSQL databases, and, notably, data...




