On Mon, Mar 18, 2002 at 08:52:19PM -0500, Robert Haeckl wrote:
> I'd like to start a thread that discusses the inherent difficulties with
> opening source and still protecting the creator's investment and
> potential profit. This has nothing to do with other issues about the
> legality of copying and sharing binaries via whatever means. It's about
> one agent using the code of another and claiming it as their own. I
> think that software development should not have to rely on services to
> make a profit, and at the same time, users of such software should have
> the choice to repair as necessary and screen for unexpected behavior.
>
> I don't want to get into any ideological veins such as programs
> inherently build on the work of prior development, hence nothing can be
> claimed as original. I don't believe basic structure and algorithms in
> general should be copyright protected. But at some level, there must be
> a way to say that such-and-such source code contains a substantial amount
> of code that is identifiably the same as that in another program's
> source code. This implies that at some level of complexity, software can
> be considered unique and original.
>
> A simple diff wouldn't provide the means to do this because naming
> schemes, comments, and general syntax could be easily modified. So my
> first question would be, "is there a present means to compare source
> code and measure similarities in a reproducible and standardized fashion
> _and_ detect methods of circumvention?" Something along the lines of
> DNA sequencing that would hold up under legal scrutiny.
>
IANAL.
Copyright covers only the expression of an idea, that is, the code as
written. The underlying method or structure can only be protected by a
patent. Now of course, if you take a piece of code and
just change all the variable names, it's likely that the copyright owner
can make a good case for copyright infringement.
In response to your direct question, I know of no tool that analyzes the
structure of two pieces of code to determine whether they substantially
represent the same idea. That said, though, a binary diff of the
executables might turn up this fact. Simply changing variable names
would not substantially affect the binary, since the compiler replaces
variable names with addresses or literal values (debugging symbols
aside). Naturally, both pieces of code would have to be compiled with
the same compiler and the same options for this test to have any
meaning.
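
A rough illustration of the renaming point, using Python bytecode as a stand-in for a native executable. This is only an analogy and a sketch: the function names are made up for the example, and a real C compiler adds complications (optimization levels, debug symbols, link order) that this ignores. It compares the compiled instruction bytes and constants of two functions while ignoring the names they use.

```python
def total_price(quantity, unit_price):
    subtotal = quantity * unit_price
    return subtotal + 5


# Same logic as total_price, with every name changed (hypothetical "copy").
def zz(a, b):
    c = a * b
    return c + 5


# Same names as total_price, but a genuinely different computation.
def different(quantity, unit_price):
    subtotal = quantity - unit_price
    return subtotal + 5


def same_compiled_body(f, g):
    # co_code holds the instruction bytes; local variables appear there
    # only as numeric slots, so their names do not matter.
    # co_consts holds the literal constants the instructions refer to.
    return (f.__code__.co_code == g.__code__.co_code
            and f.__code__.co_consts == g.__code__.co_consts)


if __name__ == "__main__":
    print(same_compiled_body(total_price, zz))
    print(same_compiled_body(total_price, different))
```

The renamed copy compiles to the same instruction bytes as the original, while the function that keeps the names but changes the arithmetic does not. Reordering statements or restructuring loops would defeat a byte-for-byte check like this one.
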
I'm not sure what your ultimate point is, though. In the Open Source
world, there isn't much concern about whether this code ultimately came
from someone else or not. The general idea is that code is for sharing,
and as a courtesy, we acknowledge the contributions of those who
originated the code. In this sense, Open Source coding is like
scientific inquiry, at least the way it used to be done. Discoveries
were openly shared, in the interest of others expanding upon them and
widening the general sphere of knowledge.
But it sounds as though you're attempting to solve a specific problem.
I'll assume it's like this: You write some code, you release it under
some Open Source license, someone else captures that code, perhaps
augments it, and makes a fortune off of it. Meanwhile, you struggle
along in your programmer's hovel, trying to eke out a living. And you
want to somehow level the playing field. Does that sound right?
If that's even close, you're probably out of luck. Nothing in any Open
Source license prevents anyone from making a fortune off someone else's
code. But under a copyleft license such as the GPL, they must also
release the source code of whatever they distribute. This fact
levels the playing field. It's a little hard to make a fortune off a
piece of code when someone else has the exact same code to hand. They
charge $1000 and you charge $50. You kill their market. Theoretically.
Of course, they could add value, which could include closed source code,
under certain conditions.
But it sounds like part of your problem is really the difference between
patents and copyrights. The problem is that, for a given software
problem, there really isn't an infinity of _workable_ solutions. If you
want to loop through something, you really only have a few ways to do
it right. And unfortunately, many software problems really have only
one workable solution. For example: b-tree code. There really is only one way
to do it. You can add caches and various bells and whistles, but in the
end, a b-tree is still a b-tree. How you get from point A to point B can
be patented, but now you've really cut off innovation entirely. If IBM
had patented all the stuff about the original PC, we might be running
some version of CP/M today, with a much more fragmented market. Look
what happened to the microchannel architecture. As soon as IBM patented
it and started demanding royalties for various things, no one wanted to
touch that architecture. And it's dead today.
Alteration of code can be detected to some extent, for the purposes of
copyright protection. It probably requires something like a "forensic
code analyst". But it's most likely a manual process, and even then, you
still have the problem of how many really workable ways there are to get
from point A to point B. If there's only one way, you may not have a
copyright case. And if you didn't patent the algorithm, you've got
nothing to help you in the patent arena.
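
A crude first-pass filter of the kind an analyst might run before the manual work could look like the sketch below. It is an assumption for illustration, not an existing forensic tool: it tokenizes Python source with the standard `tokenize` module, drops comments and layout, replaces every identifier, string and number with a placeholder, and compares the resulting token streams with `difflib`. The helper names `normalized_tokens` and `similarity` and the three sample snippets are made up for the example.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry only comments or layout, not program structure.
SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
           tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def normalized_tokens(source):
    """Token stream with comments dropped and names/literals blanked."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")        # any identifier, whatever it is called
        elif tok.type == tokenize.STRING:
            out.append("STR")
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        else:
            out.append(tok.string)  # keywords, operators, punctuation
    return out


def similarity(a, b):
    """Ratio in [0, 1] of how closely the two normalized streams match."""
    matcher = difflib.SequenceMatcher(
        None, normalized_tokens(a), normalized_tokens(b), autojunk=False)
    return matcher.ratio()


ORIGINAL = '''
def average(values):
    # sum everything, then divide
    total = 0
    for v in values:
        total += v
    return total / len(values)
'''

RENAMED = '''
def mean_of(xs):
    acc = 0
    for item in xs:   # loop
        acc += item
    return acc / len(xs)
'''

UNRELATED = '''
def greet(name):
    print("hello, " + name)
'''

if __name__ == "__main__":
    print(similarity(ORIGINAL, RENAMED))
    print(similarity(ORIGINAL, UNRELATED))
```

A high score here only says the two token streams have the same shape. It cannot tell a disguised copy from two people independently writing the one workable solution, which is exactly the point A to point B problem above, so a human still has to judge the result.
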
Companies like Xerox, IBM, Intel and others most likely employ people
who study competitors' technology to determine if it has been
"appropriated". And even then, there's no guarantee you can make the
case. An example is OpenDOS, which was built in a clean room environment
to mimic the functionality of MSDOS. The builders of OpenDOS arrived at
the same place as MSDOS, but did so only by studying the outward
manifestations of MSDOS. They could not be sued for copying, since they
did not have access to the prior MSDOS code. They effectively built a
typewriter only by observing people typing on one, not by taking one
apart. Of course, Microsoft fixed them by coding a hidden "gotcha"
(the check now usually called the AARD code) into Windows itself. As a
result, beta builds of Windows 3.1 returned spurious errors when run on
top of DR-DOS, the predecessor of OpenDOS, even though functionally, the
two DOSes were nearly identical. (I love people who
defend Microsoft; they conveniently forget incidents like this, which
mark Microsoft as a dishonest bully.)
HTH,
Paul