Google-apps
Hoofdmenu

Post a Comment On: C0DE517E

"Collecting garbage"

12 Comments -

1 – 12 of 12
Blogger Unknown said...

The problem with statements like GC is faster than alternatives such as reference counting, is that it ignores the fact that GC languages typically churn out much more garbage than manual allocation languages. Most memory in C/C++ ( above 90%) is allocated on the stack, which is either completely free or a single push/pop instruction.

May 27, 2008 at 11:13 PM

Blogger DEADC0DE said...

But I was talking about GC not about any specific language...

Anyway that is another misconception, even for languages that have only reference types (i.e. Java, C# has structs that are value types), it's well possible for the VM to know if the lifetime of an object is limited to the scope of a function and so do not use its GC heap for those objects.

It's also kinda easy to do that kind of checks, if you don't ever copy the reference outside a given scope, you know that that scope corresponds to the life of the object...

May 27, 2008 at 11:42 PM

Blogger Unknown said...

Nice write up.

I'd like to add that every memory allocation in GC based platforms, like the JVM, is almost as fast as allocating on the stack in C++. The reason is a big chunk of physical memory is allocated on the heap in the beginning and then only assigned to the object. What is really much slower is the 'collection and freeing'.

Ian's point also holds true. However, Java 7 is expected to include full Escape Analysis, which allows for lots of optimizations. Of course GC is not enough, but dynamic compilation is also required for them:

1. Conversion of heap allocations to stack allocations when possible
This should get us close to the percentage Ian mentioned. It is comparable to pointer analysis for static compiled languages, but is not bound by virtual methods and linking modules.

2. Synchronization Elision / Lock Coarsening
No unnecessary synchronization for multithreaded code. Already implemented in Java 6

3. Breaking up objects or scalar replacement
Further step to preventing fragmentation: Parts of objects might be in different memory locations to further increase the page friendliness..

May 28, 2008 at 12:31 AM

Blogger DEADC0DE said...

Exactly, escape analysis is capable of converting heap allocations to stack ones if needed.

GC allocation performances are very fast in moving, compacting collectors as you have much lower fragmentation than heap allocators.

And while it could be true that deallocations are slower for a GC (but well, that's really implementation dependant), the truth is that we don't use standard heap allocators very much, we use class pools, that are well possible in GC environments as well, or reference counting that is usually inferior to GC as it trashes the caches if chain deletions occour (and still uses the slow, and more fragmentation prone heap allocator for object creation)

May 28, 2008 at 12:43 AM

Blogger won3d said...

Plainly, GC as a way to simplify memory management is not entirely true, since it only transparently helps in the simplest cases. Note in Java you must still assign references to null when you are done with them, and there are also 3 different kinds of weak, non-owning references. GC does not mean that you can ignore ownership semantics, which must still be solved in design.

However, GC is still very important and necessary. You are certainly right about composition, particularly in the face of non-determinism. Defining lock-free data structures are far simpler when you can rely on automatic garbage collection -- it merely requires heroic effort, not divine powers.

There is one important thing that is only practically achievable via garbage collection: type safety. Manually collected handles may dangle and pun an object of a different type. This is impossible with garbage collection.

May 28, 2008 at 11:03 AM

Blogger DEADC0DE said...

That was my point, GC is not about simplyfing, is about delegating memory management to a separate system, that in turns makes composition of other systems easier.

About the type safety I don't agree, again we are confusing language features with GC features. If you make a GC in C++ (i.e. the BohemGC) then you'll have a GC system but of course you won't end up having any more safety than what you had with explicit memory allocation. Is it true that most languages that natively use GC also are type safe, as this makes writing the GC easier, but that is still a language feature.

May 28, 2008 at 4:30 PM

Blogger won3d said...

Believe me, I'm a fan of GC.

What I meant was that GC is practically necessary, but not sufficient, for type safety. One alternative to GC is to never collect objects, but that is hardly practical.

GC is kind of tricky in a concurrent environment, since the simple stop-the-world collection doesn't really work. And it is hard to do compacting compression incrementally. Etc, etc.

That being said, modern GC works pretty damn well.

May 29, 2008 at 4:13 PM

Blogger DEADC0DE said...

Type safety is a language feature, not anything that has to do with GC. C++ without reinterpret_cast could still be type safe even with new and delete...

May 29, 2008 at 6:57 PM

Blogger won3d said...

C++ would need more than a prohibition on reinterpret_cast to be type safe (e.g. pointer arithmetic, unchecked downcasts, etc). I would argue more than would be practical or useful. I like C++, btw.

Example of why you need automatic garbage collection for type safety: Say I free an object. Malloc can return that same value pointer in a future malloc, but I could still dereference the original handle, and the types no longer have to match. This can't happen with automatic garbage collection, since holding a handle guarantees that the type is stable.

June 2, 2008 at 1:05 PM

Blogger DEADC0DE said...

True that C++ needs more than removing reinterpret_cast, it was just an example...

And you're also completely right about the malloc problem, indeed it does violate type safety, but that is still a C++ problem, you could easily do type-safe explicit memory allocation, for example by storing a header with type info and checking it before dereferencing a pointer...

In general every memory problem can violate type safety in C++ (this is another very cool example: http://www.bluebytesoftware.com/blog/2008/02/10/ViolatingTypeSafetyWithTornReads.aspx), but it's because C++ does not do any runtime check, it's not something that has to do with explicit memory management in general.

On the other hand, I can't think of any easy way to statically enforce type safety with malloc and free, mhm, but that could even just be due to my lack of imagination...

June 2, 2008 at 2:08 PM

Blogger won3d said...

Yeah, I guess since we were talking about C++ I was presuming a static solution, but if you go dynamic then you're all good. And then you can have compiler analysis remove redundant checks, etc, but this starts moving away from a low-level systems language. Anyway, if you think of an alternative, then be sure to tell people!

June 2, 2008 at 3:05 PM

Blogger DEADC0DE said...

No my post was about GC in general, not about any given language. But your observations were really intresting, thanks!

June 2, 2008 at 7:44 PM

You can use some HTML tags, such as <b>, <i>, <a>

Comment moderation has been enabled. All comments must be approved by the blog author.

You will be asked to sign in after submitting your comment.
Please prove you're not a robot