
jputnam [userpic]

Resisting the urge to rewrite

November 12th, 2009 (11:40 pm)

I'm writing a parser for My Own Language, like everyone does at some point. I spent about two weeks on it, and got expressions working and a few declarations and statements. Then I threw it out and started rewriting.

I am back where I was two weeks ago, a state I have been in for the last week and a half. Curiously, implementing the parser once had no impact on how long it is taking to reimplement it. I have reached the stage where I want to rewrite again. But that would be bad, not merely because the parser is all I have, but because writing the parser would become the goal itself, rather than getting a compiler working.

I wanted to use LLVM as a backend, but that doesn't seem possible, given its halfhearted support for multiple return values and total lack of support for treating the call stack as a data stack.


What was I thinking?

October 31st, 2009 (07:49 pm)

When I wrote "★ → ★ is not inhabited", I meant that the types that inhabit it were not inhabited themselves, since, for example, every endofunctor has that kind. At the time, it seemed like the obvious meaning.

Well, that's not a productive way to look at it. At the term level, if you restrict yourself to the terms that can be created by Church encoding, the only tool you have is lambda. The only tool you have at the type level is →. Since you can mix those, the only things you will have at the type level are lambdas and arrows.

A term like λ a. e is a value, and is not inhabited. Its type is (a:b) →n c, which is a type. The type of that is ★n for some n, and is a kind. In other words, the type graph is a lobster, which is still quite manageable. So of course there are no inhabitants of the inhabitants of ★ → ★. Functions are not types.


(no subject)

April 24th, 2009 (11:52 pm)

Assume all values are tagged at runtime. Then

∀ a.a → a
can be reduced to
a:Dynamic -> b:Dynamic
  where tag a == tag b
This is the intuition behind type inference. Also, I wonder if I can do without ADTs and GADTs entirely and just use existentials and equality constraints.
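That reduction can be sketched directly; Tag, Dynamic, and checkedId below are illustrative names of my own, not any real library's:

```haskell
-- A minimal sketch of the tagged-value intuition. Every value carries
-- a runtime tag; the polymorphic identity becomes a function on
-- dynamics that is obliged to return the tag it was given.
data Tag = IntTag | CharTag deriving (Eq, Show)

data Dynamic = DInt Int | DChar Char deriving (Eq, Show)

tag :: Dynamic -> Tag
tag (DInt _)  = IntTag
tag (DChar _) = CharTag

-- The translation of ∀a. a → a: it may inspect the tag, but the
-- side condition "tag a == tag b" forces it to hand the same kind of
-- value back.
checkedId :: Dynamic -> Dynamic
checkedId x = x
```

The side condition is exactly what the quantified type promises: whatever tag goes in comes back out.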

I'll be unplugging my modem for a week (it will drive me up the wall, but that's the point). I can get my Reddit and Hacker News and webcomics at work. I might plug in to download papers or compilers.

This is from the LtU discussion on OO being intuitive:
If OOP is used instead to model the modeling system - i.e. to model a system of axioms and confidence rules, forward and backwards chaining, etc. (or, as I said above, to model the program) - the end product will (in general) be more flexible, more modular, would have more reusable components, would expose more invariants for the classes. The end result of doing so, is essentially a new language for constructing programs on-the-fly. That - configurable modularity - is the strength of OOP.
It doesn't matter who said it. When I hung out on comp.lang.ada, there was a similar quote to the effect that, in Ada, a good design models the solution space instead of the problem space. Isn't this embracing accidental complexity?

I have put off fixing function calls for too long. In accordance with the principles of the Cult of Done, I declare that I am not going to do it. It is therefore done. I'll be working on evaluating terms in a context in which all type information is available. Once that works, I'll start looking at type erasure. I started with the back end in order to always be able to construct something executable and because code emission was what I was most afraid of, and I wanted to overcome that. I've failed with the first objective but succeeded with the second.


Revisiting an early decision

April 22nd, 2009 (11:47 pm)

Don't get enamored with the complexities you have learned to live with (be they of your own making or imported). The lurking suspicion that something could be simplified is the world's richest source of rewarding challenges.
- EWD1055A
A while ago, I rejected the type function λa.a→a as the type of the polymorphic identity function. I recognized that it isn't in ★, so it couldn't be inhabited, so I couldn't have a polymorphic identity function. Instead, I postulated a double indexing of → and worked from there.

Double indexing of arrows is effective, but it's ugly and feels like wielding a sledgehammer. Here is one (broken) alternative:
★★ : ★★★
  |- L2L1

★→★ : ★★
  |- L1(G1L0(KL0))
  |- L1L0 ∧ L0a |- L1(KL0a)
given and trivial

λx.x→x : ★→★
  |- G1L0(KL0)(λx.G0x(Kx))
  L0a |- L0(G0a(Ka)) ∧ |- L1L0
GL and given
  L0a |- L0a ∧ L0a; ab |- L0a
environment lookup

? : λx.x→x
There is no such term, of course; ★→★ is not inhabited. Instead, we have:
  L0a |- G0a(Ka)(λx.x)
  L0a; ab |- L0(Kab) ∧ L0a |- L0a
environment lookup
This brings its own problems. How would I classify a→(∀b.b→b) and (∀a.a→b)→b?
  L0a |- ?(G0a(K(λb.G0b(Kb))))
L won't work; let's try Gxyz
  L0a; Lxw |- Lx(λb.G0b(Kb)) ∧ etc.
I find the idea of defining a new constant both attractive and repulsive. New constants show no promise of elegance over double indexing.


Connecting algebraic data types and existential types

April 22nd, 2009 (12:06 pm)

I still haven't taken the time to understand the Johann and Ghani paper, so this post will omit GADTs. In what follows, keep in mind the implementation of ordinary ADTs.

Suppose we can distinguish between types at runtime, perhaps by tagging all values. If we have a term of the following type:

exists (x:{Int; Char}) x
then its Church encoding is
forall a. (Int -> a) -> (Char -> a) -> a
and any function which uses it must deal with both types. The implementation therefore doesn't need to tag types at all; it (just) needs to hold the decision closure. When the vector of types from which x is taken has only a single member, the function isn't needed. In fact, as long as the type is taken from a finite vector and nothing is done with it, the decision function can be encoded as a simple number.
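A minimal GHC rendering of that encoding (IntOrChar, fromInt, fromChar, and describe are names I made up for the sketch):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A value of the existential over {Int; Char} is exactly its decision
-- closure: hand it one handler per possible type and it invokes the
-- right one. No runtime tag is ever consulted.
type IntOrChar = forall a. (Int -> a) -> (Char -> a) -> a

fromInt :: Int -> IntOrChar
fromInt n = \onInt _onChar -> onInt n

fromChar :: Char -> IntOrChar
fromChar c = \_onInt onChar -> onChar c

-- Any consumer must supply both branches, as the post argues.
describe :: IntOrChar -> String
describe v = v (\n -> "an Int: " ++ show n) (\c -> "a Char: " ++ show c)
```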

When I finish with the paper, I should be able to say whether the equality GADT is necessary for GADTs or whether the type constraint subsumes or is equivalent to it. I'm hoping for the latter.


Two problems

April 20th, 2009 (11:43 pm)

The previous nominal type system was unable to solve two simple problems. The first is deciding that our good buddy Vec is a type.

data Vec a n = VNil : Vec a 0
             | VCons a (Vec a n) : Vec a (n+1)
(I can never remember whether the index goes first or second.)

The second problem is, given a type that isn't fully polymorphic, assign a type to a function that produces a value of such a type. Assume that x is an Int.
data Restricted (a:{Int; Char})
    = One    : Restricted Int
    | NotOne : Restricted Char

restrict x | x == 1    = One
           | otherwise = NotOne
The ability to use {Int; Char} as a kind was the purpose of declaring ℘(★) − ∅ ⊂ ★★. (I forgot the empty set previously; please ignore that). If the constraint were at the head, the function caller could screw it up. The type has to encode the fact that the callee controls the result, rather than the caller. In short, it's an existential type. As a second choice, I can put the decision function into the type. That would make the type too complicated; this is a programming language, not a proof assistant. As a third choice, I can follow Haskell in just making the result type fully polymorphic, logical consistency and common sense be damned.

The choices are:
restrict : (x:Int) -> Restricted (if x == 1 then Int else Char)
restrict : Int -> exists (a:{Int; Char}) (Restricted a)
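For comparison, GHC's GADTs can express the second choice, with the existential packaged in a wrapper. This is only a sketch: GHC cannot actually restrict a to {Int; Char} here without further machinery, and SomeRestricted and isOne are made-up names.

```haskell
{-# LANGUAGE GADTs #-}

-- The constructor chosen by the callee fixes the index, so the caller
-- cannot screw it up.
data Restricted a where
  One    :: Restricted Int
  NotOne :: Restricted Char

-- Existential wrapper: the index is hidden from the caller.
data SomeRestricted where
  SomeRestricted :: Restricted a -> SomeRestricted

restrict :: Int -> SomeRestricted
restrict 1 = SomeRestricted One
restrict _ = SomeRestricted NotOne

isOne :: SomeRestricted -> Bool
isOne (SomeRestricted One) = True
isOne _                    = False
```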
Existential quantification doesn't add any power to the type system (I think), as we have:
∃x.P ≡ ∀y.(∀x.P → y) → y
exists (x:A) e ≅ forall (y:B) ((forall (x:A) e -> y) -> y)
For some suitable definition of equivalence, and ignoring the issue of which level the computation takes place on. It does add power if I don't have polymorphism over levels, unless the keyword specifies the levels, which is not at all desirable. The similarity to the continuation monad is a pleasing confirmation that this, at least, I did not screw up too badly.
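The equivalence can be transcribed almost literally in GHC Haskell; pack and unpack are illustrative names, and the continuation-monad shape is visible in both:

```haskell
{-# LANGUAGE RankNTypes #-}

-- ∃x.P ≡ ∀y.(∀x.P → y) → y, with P a type constructor p.
type Exists p = forall y. (forall x. p x -> y) -> y

-- Packing hides the witness type x.
pack :: p x -> Exists p
pack v = \k -> k v

-- Unpacking is just application: feed the package a continuation
-- that must work for every possible witness.
unpack :: Exists p -> (forall x. p x -> y) -> y
unpack e k = e k
```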

As for the first problem, the most obvious approach is to associate each type constructor with a relation between the contained values and the parameters of the result type. If the relation is a function, that simplifies type inference, at the cost of the ability to infer types of parameters from types of expressions. I need to reread the GADT papers.


Ordering function parameters

April 20th, 2009 (12:40 pm)

This felt like a big deal last night, but it doesn't seem that important now.

This seems like such a trivial thing, but it drove me crazy when dealing with SAS. There was no apparent organizing principle to parameters; they just showed up in whatever order they were thought of. It was compounded by the fact that nontrivial functionality was in these big monolithic procedures that took half a dozen parameters each. I've managed not to think about it, but I installed Factor yesterday.

Suppose I have a pair of hash tables, and the second table is a mapping from keys to hash tables. I want to think of this pair as a database table, so the first is going to have, for the time being, pairs of column names and default values. Ignoring the test to see whether that name is already present, I have

: extend-schema ( value key table -- ) first set-at ;
The function is named extend-schema, its stack effect is to consume three values, and it does so by taking the first element of the pair and setting the value at the key. Note the strange parameter ordering. Assuming that the key wasn't already present in the schema, I want to update all the entries in the table to have that key with the default value.

The library defines assoc-each ( assoc quot -- ) and the quotation should have the effect ( key value -- ), although this isn't enforced. The key and the value are in the "wrong" order. This doesn't matter yet, although it would if I wanted to change the values. So let's see what I have to do:
              ! value key table
second        ! value key entries
-rot          ! entries value key (I'm going to need to curry these)
extend-member ! entries value key quot (I'll explain shortly)
2curry        ! entries quot
assoc-each    !
extend-member therefore needs the type ( name row value key -- ). It also needs to use set-at, whose assoc argument must be brought to the top of the stack. I need to use rot, whose effect is ( x y z -- y z x ):
: extend-member ( -- quot ) [ rot set-at drop ] ;
With the resulting function being:
: extend-table ( default-value key table -- )
    [ first set-at ]
    [ second -rot [ rot set-at drop ] 2curry assoc-each ]
    3bi ;
If, on the other hand, set-at were to take its arguments in the opposite order, I could write
: extend-table ( table key default-value -- )
    [ [ first ] 2dip set-at ]
    [ [ second ] 2dip [ set-at drop ] 2curry assoc-each ]
    3bi ;
It doesn't look like I've gained very much, but I've removed an operation from a potentially very long loop. That saving may be lost again in the new implementation, though, and the test to see whether the name was already present will be harder.


I guess in this case it isn't worth it.


Reintroducing names to the type system

April 17th, 2009 (12:02 am)

In a prior post, I described some inference rules for deciding when a type was valid in a structural setting. Although this is useful for intuition, nominal types are required for practical use. Nominal types are strictly more powerful than structural types, but they also create an asymmetry in the system: while value-level expressions must contain only other values, expressions at any other level may freely mix and match values, types, kinds, sorts, and beyond.

I have to distinguish between indicators and the indicated. I will use the term type name to mean the name of a type and type construction to mean what the name refers to. Type will mean the combination of type name and the construction to which it refers. A type is valid if the construction is valid and the name is permitted to refer to it. Types will be referred to with ★, and type constructions with †. A type has the same index as its type construction.

Functions are much as before, but there's a way to make sums of products even more complicated:

If Γ |- A : ★m and Γ ; x:A |- B : ★n, then ((x:A) →(m-n,n) B) : ★n.

Let V be a nonempty vector of length I of possibly-empty vectors of independent lengths Ji such that for 0 <= i < I and 0 <= j < Ji, we have Γ ; (for all 0 <= k < j, xk:Vik) |- Vij : ★n. Then ΣiΠjVij : †n.
The latter is my best excuse yet for naming the type system System WTF. What remains is the task of naming the constructions. I will denote a type as a pair of a name and a construction.
Let N be a name, and Γ |- A : †n. Then Γ |- (N, A) : ★n.

Let N be a name and Γ ; x:A |- (M, B) : ★n. Then Γ |- (N x, B) : ★n. Note that N x is a name.
Dependent pairs appear for free, since any legal expression can go into a name. Names are essential in type checking and inference, but turn out to be trivial in construction. Modules remain elusive, since every element of a product must be at the same level.


A failed experience with Inform 7

April 16th, 2009 (01:01 pm)

Having tried Inform 7, I am now considerably less impressed by the system.

I understand why it was built. A text adventure system already has a natural-language parser. Wouldn't it be great to leverage that into the programming language? Someone was going to do it sooner or later, and it seemed all the more attractive due to the shortcomings of object-oriented languages for the field. But natural language is, and has always been, a horrible medium for logical expression.

The system reeks of the AI hype that I thought had died an all-too-painful death by now. Error messages are given in the first person and adopt a conversational tone. The verb "understand" is tossed around without caution. Every language feature has its own little syntax and gotchas. There is no uniform notion of quantification or abstraction. I'm finding it difficult to resist quoting Dijkstra here. See EWD667 for his indictment of the idea.

Underneath the tortured surface, however, there's a brilliant and underexplored programming paradigm: the use of a full-blown database directly as a programming language, instead of merely an adjunct storage device. Certain data invariants are maintained by default: if a library is created north of a foyer, the foyer is automatically south of the library, and there is a passage connecting them. I posit that this could dramatically reduce the tedium of, for example, programming GUIs.

I find the execution model dreadfully inappropriate. The more I try to flesh out what I would prefer, the more it looks like a planner. The actor looks up the conditions for an action, attempts to satisfy the conditions, and returns a plan that assumes the conditions will remain satisfied. For example, if I ask Bob the butler to unlock a door in the foyer, but the door is locked and the key is in the library, he must go through the following reasoning before agreeing:

can I open the door (conditional on unlocking the door)
  can I unlock the door (conditional on having the key)
    one of
      do I currently hold the key (no)
      can I get the key (conditional)
        do I know where the key is (yes)
        can I get to where the key is (yes)
        can I take the key, if I am where the key is (yes)
        can I return here, if I am where the key is and have taken the key (yes)
Please excuse the anthropomorphism. The corresponding plan is constructed, scheduled, and carried out. This idea is unfinished, but I've already wasted enough time on it.
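A toy backward chainer over the goal tree above might look like this; all of the goal and rule names are hypothetical, the "one of" branch is collapsed to the branch Bob actually takes, and a real planner would also track state changes and ordering:

```haskell
import Data.Maybe (listToMaybe)

data Goal = OpenDoor | UnlockDoor | HaveKey | GetKey
          | KnowKeyLocation | ReachKey | TakeKey | ReturnHere
  deriving (Eq, Show)

-- Each rule: to achieve the head, achieve the subgoals in order.
rules :: [(Goal, [Goal])]
rules =
  [ (OpenDoor,   [UnlockDoor])
  , (UnlockDoor, [HaveKey])
  , (HaveKey,    [GetKey])
  , (GetKey,     [KnowKeyLocation, ReachKey, TakeKey, ReturnHere])
  ]

-- Primitive conditions Bob can already satisfy.
facts :: [Goal]
facts = [KnowKeyLocation, ReachKey, TakeKey, ReturnHere]

-- Chain backwards from the goal, returning the flat plan of primitive
-- steps, or Nothing if some subgoal is unachievable.
plan :: Goal -> Maybe [Goal]
plan g
  | g `elem` facts = Just [g]
  | otherwise = do
      subs <- listToMaybe [gs | (h, gs) <- rules, h == g]
      concat <$> traverse plan subs
```

Asking for a plan for OpenDoor walks the same chain as Bob's monologue and bottoms out in the four primitive steps.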


The sum of products and the sum of functions

April 12th, 2009 (01:33 pm)

Let V be a nonempty vector of possibly empty vectors, such that for all i ∈ Dom V and all j ∈ Dom Vi, Vij : ★. Then Σi∈Dom VΠj∈Dom ViVij : ★.
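Concretely, the rule says every ordinary ADT is isomorphic to a nested sum of products; a sketch, with Shape as a made-up example type and () standing in for the empty product:

```haskell
-- Σ over three alternatives, Π within each alternative.
data Shape = Circle Double | Rect Double Double | Point
  deriving (Eq, Show)

type ShapeSOP = Either Double (Either (Double, Double) ())

toSOP :: Shape -> ShapeSOP
toSOP (Circle r) = Left r
toSOP (Rect w h) = Right (Left (w, h))
toSOP Point      = Right (Right ())

fromSOP :: ShapeSOP -> Shape
fromSOP (Left r)               = Circle r
fromSOP (Right (Left (w, h))) = Rect w h
fromSOP (Right (Right ()))    = Point
```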

This rule eliminates the need for axiomatic types, and subsumes the rules for both products and sums. Combined with the function rule, it is obviously powerful enough for polymorphic ADTs, but GADTs are not so obvious. Suppose I want to prove that Vec Bool (S Z) is a type, where Vec is defined as follows:

kind Nat : ** where
  Z : Nat
  S Nat : Nat

data Vec a n : * -> Nat -> * where
  VNil : Vec a Z
  VCons a (Vec a n) : Vec a (S n)
If I instantiate a with Bool and n with (S Z), I get something that is no help at all.
data Vec Bool (S Z) : * where
  VNil : Vec Bool Z
  VCons Bool (Vec Bool (S Z)) : Vec Bool (S (S Z))
In this case, I can instantiate n with Z in the second equation, but a constructor could, in theory, apply a one-way function to the type's parameters. I can think of only two ways to deal with the problem. One is to require both totality and reversibility. Reversibility is obtained by disallowing general type-level functions except where the argument is provided alongside the function, and Haskell does this with associated types. Totality is obtained in the usual way.
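Haskell's type families show the reversibility problem directly: a closed family need not be injective, so the argument cannot be recovered from the result. Squash below is a hypothetical family, not anything from a real library.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A perfectly legal, deliberately non-injective type-level function.
type family Squash a where
  Squash Int  = Bool
  Squash Char = Bool

-- Given only "Squash a ~ Bool", the checker cannot solve for a; the
-- argument must travel alongside the result, as in associated types.
squashed :: a -> Squash a -> (a, Squash a)
squashed x y = (x, y)
```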

But I am not interested in enforcing totality, and reversibility is too much trouble for the programmer, so I need a simpler procedure for acceptance. If I can determine that Vec is a valid function and that Bool and (S Z) are valid parameters, then I can accept Vec Bool (S Z) without bothering to determine whether or not it is inhabited.

This should work: If V is a nonempty vector such that for each i ∈ Dom V, Vi : a→b, then λx.ΣiVix : a→b. The full rule should handle the double indexing of the arrow, but that's the general idea.
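In GHC terms (swapping in DataKinds for the kind declaration), this acceptance criterion is what happens with a promoted Vec: the application Vec Bool ('S 'Z) is accepted because Vec, Bool, and 'S 'Z are each well-formed, with no inhabitation check.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Z | S Nat

-- Each constructor is a function from the index to a product; the
-- type is their sum.
data Vec a (n :: Nat) where
  VNil  :: Vec a 'Z
  VCons :: a -> Vec a n -> Vec a ('S n)

-- Vec Bool ('S 'Z) is a perfectly good type, and happens to be
-- inhabited as well.
singleton :: Bool -> Vec Bool ('S 'Z)
singleton b = VCons b VNil

vlen :: Vec a n -> Int
vlen VNil         = 0
vlen (VCons _ xs) = 1 + vlen xs
```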
