compact representation of gradeup permutation vector
1010data started as a database-as-a-service about 15 or 20 years ago, and is now "big data in the cloud" ... the underlying language is k3.
I was at a kx demo where 1010data spoke. The demo was on a 73 billion row table.
The speaker said something like, "I don't have to explain the power of vector languages to you. Since we can keep the resulting permutation vector of a grade up, there are all sorts of things we can do" ... at which point I raised my hand:
"Um, 73 billion 64-bit floats is a big array to keep around. Even if it wasn't in memory, it would take a long time to write to disk."
The answer was something like, "Obviously we don't use a couple hundred gigabytes of floating permutation vector, we use bit maps."
Which leaves me totally confused. Does anyone have any idea of some bit map compression (or compact representation) of what essentially amounts to an arbitrary re-ordering of iota 73 billion?
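For scale, here is a rough back-of-the-envelope sketch of the sizes involved. Only the row count comes from the demo as quoted above; the variable names are illustrative. It compares a dense vector of 64-bit entries, tightly packed indices, the log2(n!) floor that any encoding of a truly arbitrary permutation must meet, and a single bit per row.

```python
import math

n = 73_000_000_000  # row count quoted in the demo

# Dense vector of 64-bit entries (ints or floats): 8 bytes per row.
dense_bytes = n * 8

# Tightly packed indices: each index needs ceil(log2(n)) bits.
bits_per_index = math.ceil(math.log2(n))
packed_bytes = n * bits_per_index / 8

# Floor for an arbitrary permutation: log2(n!) bits.
# lgamma(n + 1) is ln(n!), so dividing by ln(2) converts it to bits.
floor_bytes = math.lgamma(n + 1) / math.log(2) / 8

# One plain bit per row, for comparison.
bitmap_bytes = n / 8

for label, size in [("dense 64-bit", dense_bytes),
                    ("packed indices", packed_bytes),
                    ("log2(n!) floor", floor_bytes),
                    ("one bit per row", bitmap_bytes)]:
    print(f"{label:16s} {size / 1e9:8.1f} GB")
```

The floor comes out at roughly 316 GB, so a bitmap-sized representation can only work if the ordering is far from arbitrary, i.e. it has structure that can be exploited.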
- tclviii-dyalog
- Posts: 28
- Joined: Tue Mar 02, 2010 6:04 pm
Re: compact representation of gradeup permutation vector
- You could have (should have) asked the speaker for more details.
- The table had LOTS of duplicate rows. Even then it's dicey, because a single bit vector with 73e9 entries is already 9 GB.
- The ordering is not random but has structure that can be exploited. e.g. Nobody creates a 73e9 row table from scratch. Therefore there is a large existing table, already ordered, and you just need to know where in the large table to slot in some small number of new rows.
- The table had 73 MILLION rows rather than 73 billion rows. :-)
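One way a bitmap could stand in for most of a permutation, under the "already ordered table plus a few new rows" assumption above, is sketched below. This is a guess at the general idea, not 1010data's actual method; `merge_bitmap` and `expand` are hypothetical names. The merged order is described by one bit per row (set where a new row lands) plus the small grade of the new rows alone.

```python
from bisect import bisect_right

def merge_bitmap(base_sorted, batch):
    """Describe the order of base_sorted + batch without a full permutation.

    Returns (bitmap, batch_grade):
      bitmap[i] is 1 when position i of the merged order comes from the batch;
      batch_grade is the grade-up of the batch alone (a small permutation).
    """
    batch_grade = sorted(range(len(batch)), key=batch.__getitem__)
    bitmap = [0] * (len(base_sorted) + len(batch))
    for rank, j in enumerate(batch_grade):
        # Slot among the base rows (ties keep base rows first), shifted by
        # the number of batch rows already placed ahead of this one.
        bitmap[bisect_right(base_sorted, batch[j]) + rank] = 1
    return bitmap, batch_grade

def expand(bitmap, batch_grade, n_base):
    """Rebuild the full grade-up vector over base_sorted + batch."""
    base_ix = iter(range(n_base))
    batch_ix = iter(batch_grade)
    return [n_base + next(batch_ix) if bit else next(base_ix) for bit in bitmap]

base = [10, 20, 30, 40]   # existing table, already in order
batch = [35, 5, 20]       # small number of new rows
bitmap, batch_grade = merge_bitmap(base, batch)
full_grade = expand(bitmap, batch_grade, len(base))
```

With n existing rows and m new rows this stores n+m bits plus m indices, which is small only when m is small relative to n.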
- Roger|Dyalog
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: compact representation of gradeup permutation vector
I'm fairly sure that they block their data. A multi-billion row table is spread out over multiple machines. Some form of map/reduce is used to process manageable chunks on multiple machines, which are then aggregated back on the controlling machine. Furthermore, if it is time-series data, it will be sharded or blocked by time, so the table is essentially presorted. There will never be a 73 billion item vector of any sort in use.
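A minimal single-process sketch of the blocking idea, assuming fixed-size blocks (the function names and block size are made up for illustration; a real system would hold each block on a different machine): every block is graded on its own, so indices stay small and local, and a global order is only ever streamed through a k-way merge, never materialised as one vector.

```python
import heapq

def blocked_grade(values, block_size):
    """Grade each block on its own; indices stay local to the block."""
    grades = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        grades.append(sorted(range(len(block)), key=block.__getitem__))
    return grades

def merged_order(values, grades, block_size):
    """Lazily stream global row numbers in sorted order via a k-way merge."""
    def run(b, grade):
        start = b * block_size
        for local in grade:
            # (value, global row): ties on value fall back to row number.
            yield values[start + local], start + local
    runs = [run(b, g) for b, g in enumerate(grades)]
    return (row for _, row in heapq.merge(*runs))

values = [5, 3, 9, 1, 7, 3, 8]
grades = blocked_grade(values, block_size=3)
order = list(merged_order(values, grades, block_size=3))
```

If the blocks are already ordered by time and the sort key is time, the merge degenerates to reading the blocks one after another.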
- paulmansour
- Posts: 420
- Joined: Fri Oct 03, 2008 4:14 pm