December 28, 2007 at 6:43 pm
· Filed under DB, Java, Open source, Software
There are situations when you can’t put all the data you need into a single instance of a relational database. The reasons may differ. Maybe there is simply too much data. Or there is a problem with network latency in a distributed architecture. Scaling? Or one of many other reasons. The answer is horizontal partitioning. The process of splitting up your data sets was once called partitioning; now it has a new name: sharding. If your data doesn’t fit on one machine, you split it up into segments, and each segment is called a shard. The term was used initially at Google but is now spreading everywhere.
So… There is a new project, Hibernate Shards, a framework designed to encapsulate and minimize the complexity of accessing multiple databases by adding support for horizontal partitioning on top of Hibernate Core. It started as a 20 percent project at Google, but it is open source now and licensed under the LGPL. If you know the Hibernate Core API, you know the Shards API, as its implementation doesn’t violate the Core API. The basic assumptions and paradigms for using Hibernate are still valid, as the Configuration, SessionFactory, and Session objects are almost exactly the same. These interfaces from Hibernate Core:
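To make the idea of a shard a bit more concrete, here is a minimal sketch of routing a record to a shard by its key. It assumes a fixed number of shards and a plain modulo rule; ShardRouter and shardFor are hypothetical names made up for this illustration and are not part of any library.

```java
// Hypothetical illustration of horizontal partitioning; not a Hibernate Shards API.
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Maps a numeric key (for example a customer id) to a shard index in [0, shardCount).
    int shardFor(long key) {
        // Math.floorMod keeps the result non-negative even for negative keys.
        return (int) Math.floorMod(key, (long) shardCount);
    }
}
```

With four shards, key 10 lands on shard 2. A modulo rule like this is easy to reason about, but changing the number of shards later moves most keys to a different shard, which is one reason real systems often use more elaborate rules.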
- org.hibernate.Session
- org.hibernate.SessionFactory
- org.hibernate.Criteria
- org.hibernate.Query
have shard-aware extensions:
- org.hibernate.shards.session.ShardedSession
- org.hibernate.shards.ShardedSessionFactory
- org.hibernate.shards.criteria.ShardedCriteria
- org.hibernate.shards.query.ShardedQuery
The implementations of these four interfaces serve as a sharding engine that knows how to apply application-specific sharding logic. This logic is a set of rules describing how data is distributed across the shards. To specify this logic we have to implement the interfaces below:
- org.hibernate.shards.strategy.selection.ShardSelectionStrategy
- org.hibernate.shards.strategy.resolution.ShardResolutionStrategy
- org.hibernate.shards.strategy.access.ShardAccessStrategy
Hibernate Shards comes with a couple of simple implementations of these interfaces. For instance, for the shard selection strategy there is a load-balanced round-robin implementation, and for the shard access strategy we have a choice of sequential or parallel access.
To finish the topic of sharding logic I should mention id generation as well. As standard database sequences can’t be used in a distributed environment, we have a choice of two primary key generators:
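A round-robin rule for choosing the shard of a new object can be sketched roughly like this. The interface below is a simplified, hypothetical stand-in written for this post; it is not the real signature of org.hibernate.shards.strategy.selection.ShardSelectionStrategy, and shards are represented by plain integer indexes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for a shard selection rule.
interface SimpleShardSelector {
    // Returns the index of the shard that should store a newly created object.
    int selectShardForNewObject(Object entity);
}

// Hands out shard indexes 0, 1, ..., shardCount - 1 and then starts over.
final class RoundRobinSelector implements SimpleShardSelector {
    private final int shardCount;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSelector(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    @Override
    public int selectShardForNewObject(Object entity) {
        // getAndIncrement is atomic, so concurrent callers get distinct counters;
        // floorMod keeps the index valid even after the counter overflows.
        return Math.floorMod(next.getAndIncrement(), shardCount);
    }
}
```

The entity argument is ignored here on purpose. An application-specific rule would look at it, for example to keep all data of one customer on the same shard.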
- ShardedUUIDGenerator - that generates a big random number
- ShardedTableHiLoGenerator - that uses a table in one of the shards to generate the primary key. Obviously it is a single point of failure for our system.
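One reason to generate keys in the application is that a key can carry the shard it belongs to, so the shard can be found again from the id alone. The sketch below is only an illustration of that idea under my own assumptions (a four-digit hex shard prefix followed by a random UUID); ShardAwareIds is a hypothetical class and this is not a description of how the generators above work internally.

```java
import java.util.UUID;

// Hypothetical id scheme: 4 hex digits of shard index + 32 hex digits of random UUID.
final class ShardAwareIds {
    private ShardAwareIds() {
    }

    // Creates a new 36-character id that encodes the given shard index.
    static String newId(int shardIndex) {
        if (shardIndex < 0 || shardIndex > 0xFFFF) {
            throw new IllegalArgumentException("shardIndex must fit in 16 bits");
        }
        String random = UUID.randomUUID().toString().replace("-", "");
        return String.format("%04x%s", shardIndex, random);
    }

    // Recovers the shard index from an id produced by newId.
    static int shardOf(String id) {
        return Integer.parseInt(id.substring(0, 4), 16);
    }
}
```

Because no shared table or sequence is involved, a scheme like this has no single point of failure, at the price of long, non-sequential keys.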
Although the project is still at an early stage and even its creators warn about possible glitches, this nice wrapper, which hides all the complexity of distributing data across multiple relational databases, is very much worth looking at, especially if Hibernate is already your ORM tool of choice.
December 25, 2007 at 1:21 pm
· Filed under Conference, Video
For those ultra-geeks who didn’t have a chance to go to Paris for LeWeb3 or to Antwerp for JavaPolis, but who are curious and want to spend some time during or between eating, drinking, and meeting family and friends to catch up on what happened there, I found some video coverage from both conferences.
LeWeb3
JavaPolis
Enjoy!
December 24, 2007 at 4:25 pm
· Filed under Identity, Internet, Web2.0
If you google yourself often, don’t be ashamed. It is not a sign of vanity or egocentrism. If you do it regularly, it means that you are in the top 3% of internet users: users who care what the Web knows about them. Conscious, sane and sensible users.
As the Pew Internet and American Life Project says in its report on online identity management and search in the age of transparency, more and more people are aware of their publicly available digital footprint. 47% have searched for information about themselves online, up from just 22% five years ago. There are also interesting findings about what we think of the amount of information about us available online, the need for self-promotion online, transparency in social networks, and how many of us have used search engines to follow others’ digital footprints.
So, if you can find the time, have a look at the full report (PDF); it is really interesting.
OK, enough for now. It is almost Christmas, isn’t it? I would like to take this opportunity to wish all of you a Merry Christmas and a Happy New Year!
December 17, 2007 at 11:54 pm
· Filed under Books, Money
I have never been interested in money. But at some point my friends started to talk more and more about money, buying flats or investing. And I was thinking:
What’s wrong with you guys? Don’t you have something interesting to do? You are getting so old and boring!
I was happy with my income and, what is even more important to me, my job was something I liked doing. Money was just an uninteresting but necessary addition to that activity. But who cares?
Do you recognise a pattern here that is very common in our industry? An industry full of geeks focused only on “cool” stuff, where money is certainly not regarded as something with a big kick of coolness? I recently read Rich Dad, Poor Dad by Robert Kiyosaki. I can recommend it as a good start for learning about this subject we hate so much, and maybe for changing that point of view. And I think the sooner the better. It can only be beneficial for us!
December 12, 2007 at 11:34 am
· Filed under Gadgets, Hardware
After reading the latest Coding Horror I decided to write about my treat for this Christmas. Wholesale buying of crap begins! This year I decided to buy the Audio-Technica ATH-ANC7 QuietPoint noise-canceling headphones, and thanks to the generosity of a friend who went back home for Thanksgiving and Black Friday, I am now their lucky owner. I’ve been using them for the last two weeks and I have to say that I am very happy. I am not going to write a full review, but I do want to say a couple of words about them.
The headphones are equipped with a noise-canceling system which works surprisingly well. It makes my daily commute a pleasure now: I can listen to music on the tube without turning the volume up to max. I haven’t tried them on a plane yet, but I have heard from other people that they cancel engine noise very well. We can use them as headphones with our favourite music player or only for noise protection, as the cable can be easily disconnected. The sound itself is just great: very rich, with solid bass and a vivid high register. Pleasure! A nice feature is the ability to work in passive mode, so the headphones are still usable for listening to music even if the battery runs out of juice. I tried the Bose QuietComfort headphones as well, and I think the ATH-ANC7 are in the same league of quality and are the better value option.
What’s your geeky treat this year?
December 9, 2007 at 8:32 pm
· Filed under Blogroll, General
My friend Rags wrote about work-life balance lately. He writes about the challenges we face as professionals in today’s world:
If I regularly take time off after work to just “chill out and relax”, I realize I’m being left behind in the rat-race while everyone else is constantly moving forward.
We live in a society with a ubiquitous flow of information and constantly changing trends and opinions, but our time is limited, so we are bound to make decisions all the time: which news we follow, which blogs and books we read, which technology we are going to try and spend our precious time on. We could try an analogy with the finance sector: we have a limited resource, in our case time, and we have to invest it in the best way to get the best result. Quantifying this result can be difficult, as it is not obvious what to take into account: career progress, the money we earn, being a well-rounded and conscious professional, or maybe the degree to which we influence and interact with our closer or more distant professional neighborhood?
Did I forget to mention private life?
December 9, 2007 at 6:46 pm
· Filed under Software, TDD, Testing
I believe in testing, I really do. Combined with continuous integration, daily builds and automation of all aspects of the build process, it is for me a mandatory element of any software project that aspires to be successful. Joel thinks there are more.
I believe in unit tests as well as integration and acceptance tests. I can see the value in them: I have been happily using TDD for the last two years, and unit testing, without being so strict about it, for much longer. It just works for me, gives me something.
However, I still have questions on this subject. Let’s put up the first of them:
Should we or shouldn’t we unit test private methods of a class?
There are valid arguments for both answers. On the one hand, if we limit ourselves to the public interface and test our class as a black box which behaves in a certain way, without caring how it is implemented, then the tests are easier (read: cheaper) to maintain. Whenever we have to change the implementation, we don’t have to change the tests. Neat.
But there is another side as well. What about this word “unit”? What about TDD? If TDD can actively help us with developing a single unit of code, why not use it? As we all know, the “first red, last green” routine of writing a test, watching it fail, developing the code and making the test pass might be painful, and it might take a long time to switch to that way of development, but at the end of the day it gives code of better quality. But what shall we do if that single unit of code is by design a private method? Shall we change that, following the philosophy of design for testability? That might lead to innocent changes like converting (Java):
private void foo() {
...
}
to:
protected void foo() {
...
}
Then we can create unit tests for that method in the same package. That was really innocent. But what if we have to reach into the toolbox of dirty tools? We can’t really use the same solution in C#. Let’s first have a look at the method access modifiers in that language:
- public indicates the method is freely accessible inside and outside of the class in which it is defined
- internal means the method is only accessible to types defined in the same assembly
- protected means the method is accessible in the type in which it is defined, and in derived types of that type. This is used to give derived classes access to the methods in their base class
- protected internal means the method is accessible to types defined in the same assembly or to derived types, even if they are defined in another assembly
- private methods are only accessible in the class in which they are defined.
As we can see, our dear friends from Microsoft have done it in an even more problematic way. We have a choice of converting private methods to internal, but then our tests have to be placed in the same assembly as the production code (or the assembly has to expose its internals to the test assembly with the InternalsVisibleTo attribute), or we can use the public modifier, or we have to use some tricks with reflection. None of these is perfect: each is either wrong or smells.
I understand that there is no universal answer to the question above. Applying common sense usually pays off, but I wonder what your opinion is about testing in general, and about testing private methods in particular.
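The reflection trick has a Java counterpart as well, which leaves the method private. Here is a minimal sketch, assuming a hypothetical Calculator class with a private doubleIt method; PrivateMethodProbe and invokePrivate are illustrative names only, not part of any testing framework.

```java
import java.lang.reflect.Method;

// Hypothetical production class with a private method we want to test.
class Calculator {
    private int doubleIt(int value) {
        return value * 2;
    }
}

// Hypothetical test helper that calls a private method by name.
final class PrivateMethodProbe {
    private PrivateMethodProbe() {
    }

    static Object invokePrivate(Object target, String name, Class<?>[] parameterTypes, Object... args) {
        try {
            // getDeclaredMethod also finds private methods declared on the class itself.
            Method method = target.getClass().getDeclaredMethod(name, parameterTypes);
            // Switch off the access check for this Method object only.
            method.setAccessible(true);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            // Wrap checked reflection errors so that tests can call this directly.
            throw new IllegalStateException("Cannot invoke " + name, e);
        }
    }
}
```

A test would then call something like PrivateMethodProbe.invokePrivate(new Calculator(), "doubleIt", new Class<?>[] {int.class}, 21). It works, but the method name is now just a string, so a rename breaks the test at run time and not at compile time. That is the smell.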
December 6, 2007 at 12:47 pm
· Filed under Mashup, Web2.0
If social networks and community pages mean more than just fun for you and you are interested in understanding the mechanics of the business models behind those services, then you might be interested in a report about social services published by FaberNovel. The report consists of three parts:
- general rules of social networks
- case studies of matchmaking services Meetic and Match.com
- case studies of business networks Xing and LinkedIn
The authors of the report divided all Web 2.0 services into four groups:
- online communities - whose main focus is on “socializing”
- business networks - with focus on career and business opportunities
- online matchmaking - dating services
- alumni networks - helping to stay in touch with friends from school or university
Even if it looks as though there should be many more types of services, this classification seems to cover most cases, missing maybe only some specializations of those types.
The report is available for download as a PDF.
December 2, 2007 at 6:42 pm
· Filed under Open source, Software, Web2.0
I can recall a conversation with psd from early this year about why I think the current way of uploading photos to Flickr sux big time. I treat Flickr as a repository and backup of _all_ my photos (12525 photos at the moment, still growing) and not just as a photo blog. It means that very often I have to upload quite a lot of them in batches. The existing solutions just make me angry (how shabby is the standard Flickr Uploadr!) when I have to tag the photos, name them correctly or create sets. I am not (yet) a Mac user. Maybe there is something more user friendly there, I hope.
So… Coming back to the conversation with psd: I told him that unless there is some clever integration with the OS which makes the process of uploading smoother, it will always be a pain. I also expected a bit more metadata to be populated automatically by the camera. How cool would it be to have GPS and geo-tag all the photos when taking them? And that’s not the end. I would also expect to be able to define some tags in the camera.
That would solve part of the problem with tags. The other problems are setting permissions, creating sets etc. For a pretty long time I couldn’t find any tool which I liked. Lately, after migrating all my home computers to Ubuntu Gutsy, I found something called Flickrfs as an available package. It is a virtual filesystem which mounts a Flickr account like any other file storage. It synchronizes the Flickr account with the local filesystem and shows photos as image files, with all metadata represented as text files.
Imagine that. You can upload photos to Flickr just by using the cp command, and the same applies to downloading photos (even if Flickr itself makes it as hard as possible to get your own photos back). Deleting is as easy as invoking rm. You can set permissions using the standard chmod command and define sets by creating symlinks with ln.
I like the idea itself very much. Flickrfs is created and maintained mainly by one person, so there might be some small “bugletes”, but I still support this project and wish it the best. Well done, Manish Rai Jain!