Reading 2: Basic Java
Due the night before class: you must complete the reading exercises and Java Tutor exercises in this reading by Thursday, September 3 at 10:00 pm MIT time. These exercises are graded based on the level of effort you put into them, not on correctness, as described in the course general info.
Getting credit for reading exercises: on the right is a big red log in button. You will only receive credit for reading exercises if you are logged in when you do them. And again, you will need to put care and attention into the reading exercises in order to get full credit.
Due before class: you must complete Problem Set 0 Part I before class on Friday, September 4 at 11:00 am MIT time.
Objectives
Software in 6.031
Getting started with Java
Read the first six sections of From Python to Java (27 short pages):
In the Java Tutor in Eclipse, complete the first two categories: Basic Java and Numbers & Strings.
After you have done these categories, your Java Tutor pane in Eclipse should show the categories fully checked off with a green checkmark:
Then check your understanding by answering some questions about how the basics of Java compare to the basics of Python:
reading exercises
Note that this page doesn’t keep track of the exercises you’ve previously done. When you reload the page, all exercises reset themselves.
If you’re a registered student, you can see which exercises you’ve already done by looking at Omnivore.
Snapshot diagrams
It will be useful for us to draw pictures of what’s happening at runtime, in order to understand subtle questions. Snapshot diagrams represent the internal state of a program at runtime – its stack (methods in progress and their local variables) and its heap (objects that currently exist).
Here’s why we use snapshot diagrams in 6.031:
- To talk to each other through pictures (in class and in team meetings)
- To illustrate concepts like primitive types vs. object types, immutable values vs. unreassignable references, pointer aliasing, stack vs. heap, abstractions vs. concrete representations.
- To pave the way for richer design notations in subsequent courses. For example, snapshot diagrams generalize into object models in 6.170.
Although the diagrams in this course use examples from Java, the notation can be applied to any modern programming language, e.g., Python, JavaScript, C++, Ruby.
The simplest kind of snapshot diagram shows a variable name with an arrow pointing to the variable’s value:
int n = 1;
double x = 3.5;
When the value is an object value (as opposed to a primitive value), it is denoted by a circle labeled by its type:
BigInteger val = new BigInteger("1234567890");
Snapshot diagram syntax is intended to be flexible, not necessarily showing all the details all the time, so that we can draw simple diagrams that focus on particular aspects of the program state that we want to discuss.
For example, the diagrams at the right are all reasonable ways to display a string variable in a snapshot diagram.
String s = "hello";
We may use different diagrams in different contexts.
Sometimes we care about the particular value of s, and sometimes we don’t.
Sometimes we want to emphasize that a String is an object value (as opposed to a primitive), and sometimes that isn’t relevant.
When we want to show more detail about an object value, we will write field names inside the object circle, with arrows pointing out to their values.
Point pt = new Point(5, -3);
For still more detail, variables can include their declared types.
Some people prefer to write x:int instead of int x, but both are fine.
Mutating values vs. reassigning variables
Snapshot diagrams give us a way to visualize the distinction between changing a variable and changing a value:
When you assign to a variable or a field, you’re changing where the variable’s arrow points. You can point it to a different value.
When you change the contents of a mutable object – such as an array or list – you’re changing references inside that value.
Reassignment and immutable values
For example, if we have a String variable s, we can reassign it from a value of "a" to "ab".
String s = "a";
s = s + "b";
String is an example of an immutable type, a type whose values can never change once they have been created.
Immutability is a major design principle in this course, and we’ll talk much more about it in future readings.
In a snapshot diagram, when we want to emphasize the immutability of an object like String, we draw it with a double border, as shown in the diagram here.
Mutable values
By contrast, StringBuilder (another built-in Java class) is a mutable object that represents a string of characters, and it has methods that change the value of the object:
StringBuilder sb = new StringBuilder("a");
sb.append("b");
When we want to emphasize the mutability of an object like StringBuilder, we draw it with a dashed border, as shown here.
These two snapshot diagrams look very different, which is good: the difference between mutability and immutability will play an important role in making our code safe from bugs.
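To make the difference concrete, here is a minimal sketch that adds a second reference (an alias) to each object. The class name MutationVsReassignment, its method names, and the variable alias are hypothetical, introduced only for illustration. Reassigning s leaves the alias pointing at the original String, while mutating the StringBuilder is visible through every reference to it.

```java
class MutationVsReassignment {
    /** Reassigning s re-points s only; other references to the old String are unaffected. */
    static String reassignString() {
        String s = "a";
        String alias = s;   // alias points to the same String object as s
        s = s + "b";        // creates a new String "ab" and re-points s to it
        return alias;       // alias still points to the original "a"
    }

    /** Mutating the StringBuilder changes the one object that both references share. */
    static String mutateBuilder() {
        StringBuilder sb = new StringBuilder("a");
        StringBuilder alias = sb;   // alias points to the same StringBuilder as sb
        sb.append("b");             // changes the contents of that object
        return alias.toString();    // the change is visible through alias
    }

    public static void main(String[] args) {
        System.out.println(reassignString()); // prints a
        System.out.println(mutateBuilder());  // prints ab
    }
}
```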
Unreassignable references
Java also gives us immutability for references: variables that are assigned once and never reassigned.
To make a reference unreassignable, declare it with the keyword final:
final int n = 5;
In this code, n can never be reassigned; it will refer to the value 5 for its entire lifetime.
If the Java compiler isn’t convinced that your final variable will only be assigned once at runtime, then it will produce a compiler error.
So final gives you static checking for unreassignable references.
In a snapshot diagram, when we want to focus on reassignability, we will use a double-arrow for unreassignable (final) references, and a dashed arrow for reassignable references.
The diagram at the right shows an object whose id never changes (it can’t be reassigned to a different number), but whose age can change.
Note that we can have an unreassignable reference to a mutable value: the contents of the object can change even though the reference always points to the same object:
final StringBuilder sb = new StringBuilder("a");
sb.append("b");
We can also have a reassignable reference to an immutable value: the value of the variable can change because the variable can be re-pointed to a different object:
String s = "a";
s = "ab";
Double lines and dashed lines help to emphasize the changeability or unchangeability of parts of a snapshot diagram. But when reassignability or mutability is obvious, or not relevant to the discussion, we will keep the diagram simple, and just use single-line arrows and object borders.
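The two combinations above can be put side by side in a small runnable sketch. The class and method names here are hypothetical. The commented-out line shows the kind of reassignment that the compiler would reject for a final variable.

```java
class FinalReferenceDemo {
    /** An unreassignable (final) reference to a mutable object: the object can still change. */
    static String mutateThroughFinal() {
        final StringBuilder sb = new StringBuilder("a");
        sb.append("b");                    // allowed: mutates the object sb points to
        // sb = new StringBuilder("ab");   // static error if uncommented: sb is final
        return sb.toString();
    }

    /** A reassignable reference to an immutable object: the variable can be re-pointed. */
    static String reassignToImmutable() {
        String s = "a";
        s = "ab";                          // allowed: s now points to a different String
        return s;
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughFinal());   // prints ab
        System.out.println(reassignToImmutable());  // prints ab
    }
}
```

Both methods end with "ab", but they get there differently: the first changes one object in place, the second moves an arrow from one unchanging object to another.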
reading exercises
Here is a function that mutates and reassigns its parameters:
void f(String s, StringBuilder sb) {
    s.concat("b");
    s += "c";
    sb.append("d");
}
Suppose it is called like this:
String t = "a";
StringBuilder tb = new StringBuilder(t);
f(t, tb);
After this code runs, what sequence of characters does t refer to?
At the same point, what sequence of characters does tb refer to?
== vs. equals()
Java has two different ways to test equality of values, depending on whether the values are primitives or objects:
- The == operator compares the values of primitives. For example, 5 == 5 returns true, as does 'a' == 'a'.
- The .equals() method compares the values of objects. For example, "abc".equals("abc") returns true.
In Python, which lacks a distinction between primitive and object values, you use == for both of these purposes.
This can lead to confusion and mistakes in Java – using the wrong kind of equality for the type you’re trying to compare.
Using equals() on primitive types is fortunately easy to catch.
5.equals(5) produces a static error because Java doesn’t allow calling any method on a primitive type.
But the other mistake, using == to compare object values for equality, is much more painful, because == is overloaded in Java.
When used on object types, == tests whether the two expressions refer to the same object in memory.
In terms of the snapshot diagrams we’ve been drawing, two references are == if their arrows point to the same object bubble.
In Python, this operator is called is.
So if the state of the program looks like the figure on the right, then:
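- Use == for comparing primitive values, like ints, chars, and doubles.
- Use equals() for comparing object values, like lists, strings, and other objects. (Arrays are an exception: calling equals() on an array behaves like ==, so use Arrays.equals to compare array contents.)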
char values are primitives, representing exactly one character. A char literal is always single-quoted, like 'a'.
String values are objects, representing a string of zero or more characters. A String literal is double-quoted, like "abc" and "".
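The following sketch shows both kinds of equality on String objects. The class name EqualityDemo and its method names are hypothetical. It uses new String("abc") deliberately, so that x and y are guaranteed to be two distinct objects holding the same characters.

```java
class EqualityDemo {
    /** Two distinct String objects with the same characters are not ==. */
    static boolean sameObject() {
        String x = new String("abc");
        String y = new String("abc");
        return x == y;          // false: the two arrows point to different objects
    }

    /** equals() compares the characters, not the identity of the objects. */
    static boolean sameValue() {
        String x = new String("abc");
        String y = new String("abc");
        return x.equals(y);     // true: both objects represent "abc"
    }

    /** Two references to one object are ==. */
    static boolean aliased() {
        String x = new String("abc");
        String z = x;           // z points to the same object as x
        return x == z;          // true: both arrows point to the same object
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // prints false
        System.out.println(sameValue());  // prints true
        System.out.println(aliased());    // prints true
    }
}
```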
reading exercises
Here is a function f that takes several parameters:
void f(String s1, String s2, int i1, int i2, char c1, char c2)
What happens if f uses the expressions below to compare the values of its parameters for equality?
Java Collections
Read the Collections section of From Python to Java (6 short pages).
Lists, Sets, and Maps
A Java List is similar to a Python list.
A List contains an ordered collection of zero or more objects, where the same object might appear multiple times.
We can add and remove items to and from the List, which will grow and shrink to accommodate its contents.
In a snapshot diagram, we represent a List as an object with indices drawn as fields:
This list of cities might represent a trip from Boston to Bogotá to Barcelona.
A Map is similar to a Python dictionary.
In Python, the keys of a dictionary must be hashable.
Java has a similar requirement that we’ll discuss when we confront how equality works between Java objects.
In a snapshot diagram, we represent a Map as an object that contains key/value pairs:
This turtles map contains Turtle objects assigned to String keys: Bob, Buckminster, and Buster.
A Set is an unordered collection of zero or more unique objects.
Like a mathematical set or a Python set – and unlike a List – an object cannot appear in a set multiple times.
Either it’s in or it’s out.
Like the keys of a map, the objects in a Python set must be hashable, and Java has a similar requirement.
In a snapshot diagram, we represent a Set as an object with no-name fields:
Here we have a set of integers, in no particular order: 42, 1024, and -7.
Literals
Python provides convenient syntax for creating lists and dictionaries:
lst = [ "a", "b", "c" ]
map = { "apple": 5, "banana": 7 }
Java does not. It does provide a literal syntax for arrays:
String[] arr = { "a", "b", "c" };
But this creates an array, not a List.
We can use the utility function List.of to create a List from arguments:
List.of("a", "b", "c");
A List created with List.of comes with an important restriction: it is immutable!
So we can’t add, remove, or replace elements once the list has been created.
Java also provides Set.of for creating immutable sets and Map.of for creating immutable maps:
Set.of("a", "b", "c");
Map.of("apple", 5, "banana", 7);
Generics: declaring List, Set, and Map variables
Unlike Python collection types, with Java collections we can restrict the type of objects contained in the collection. When we add an item, the compiler can perform static checking to ensure we only add items of the appropriate type. Then, when we pull out an item, we are guaranteed that its type will be what we expect.
Here’s the syntax for declaring some variables to hold collections:
List<String> cities; // a List of Strings
Set<Integer> numbers; // a Set of Integers
Map<String,Turtle> turtles; // a Map with String keys and Turtle values
Because of the way generics work, we cannot create a collection of primitive types.
For example, Set<int> does not work.
However, as we saw earlier, ints have an Integer wrapper we can use (e.g. Set<Integer> numbers).
In order to make it easier to use collections of these wrapper types, Java does some automatic conversion.
If we have declared List<Integer> sequence, this code works:
sequence.add(5); // wrap 5 as an Integer object, and append it to the sequence
int second = sequence.get(1); // get the second Integer element, and unwrap it into an int
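Here is a self-contained version of that snippet, as a sketch. The class name BoxingDemo is hypothetical, and the list is created as an ArrayList with two elements added so that get(1) has something to return.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    /** Adds two ints to a List<Integer> and reads the second one back as an int. */
    static int secondElement() {
        List<Integer> sequence = new ArrayList<>();
        sequence.add(5);               // 5 is wrapped as an Integer object
        sequence.add(7);               // 7 is wrapped as an Integer object
        int second = sequence.get(1);  // the Integer at index 1 is unwrapped into an int
        return second;
    }

    public static void main(String[] args) {
        System.out.println(secondElement()); // prints 7
    }
}
```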
ArrayLists and LinkedLists: creating Lists
As we’ll see soon enough, Java helps us distinguish between the specification of a type – what does it do? – and the implementation – what is the code?
List, Set, and Map are all interfaces: they define how these respective types work, but they don’t provide implementation code.
There are several advantages, but one potential advantage is that we, the users of these types, get to choose different implementations in different situations.
Here’s how to create some actual Lists:
List<String> firstNames = new ArrayList<String>();
List<String> lastNames = new LinkedList<String>();
If the generic type parameters are the same on the left and right, Java can infer what’s going on and save us some typing:
List<String> firstNames = new ArrayList<>();
List<String> lastNames = new LinkedList<>();
ArrayList and LinkedList are two implementations of List.
Both provide all the operations of List, and those operations must work as described in the documentation for List.
In this example, firstNames and lastNames will behave the same way; if we swapped which one used ArrayList vs. LinkedList, our code would not break.
Unfortunately, this ability to choose is also a burden: we didn’t care how Python lists worked, so why should we care whether our Java lists are ArrayLists or LinkedLists?
Since the only difference is performance, for 6.031 we don’t.
If you want to initialize an ArrayList or LinkedList, you can give it another collection as an argument:
List<String> firstNames = new ArrayList<>(List.of("Huey", "Dewey", "Louie"));
List<String> lastNames = new ArrayList<>(Set.of("Duck"));
A key difference between List.of and ArrayList is mutability.
Where List.of("Huey", "Dewey", "Louie") produces an immutable list of three strings, new ArrayList<>(List.of("Huey", "Dewey", "Louie")) produces a mutable list initialized with those strings.
If you need an initialized mutable list, this is simpler than making multiple calls to add().
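A short sketch of that difference follows. The class name ListOfDemo, its method names, and the extra name "Donald" are hypothetical. Calling add on a list created by List.of throws UnsupportedOperationException at runtime, while the ArrayList copy accepts the new element.

```java
import java.util.ArrayList;
import java.util.List;

class ListOfDemo {
    /** Returns true if adding to a List.of list is rejected at runtime. */
    static boolean addToImmutableFails() {
        List<String> names = List.of("Huey", "Dewey", "Louie");
        try {
            names.add("Donald");   // not allowed: lists from List.of are immutable
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    /** Returns a mutable list initialized from List.of, with one more name added. */
    static List<String> mutableCopy() {
        List<String> names = new ArrayList<>(List.of("Huey", "Dewey", "Louie"));
        names.add("Donald");       // allowed: ArrayList is mutable
        return names;
    }

    public static void main(String[] args) {
        System.out.println(addToImmutableFails()); // prints true
        System.out.println(mutableCopy());         // prints [Huey, Dewey, Louie, Donald]
    }
}
```

Note that this mistake is caught only when the code runs, not by the compiler, because List.of and ArrayList both produce a List.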
HashSets and HashMaps: creating Sets and Maps
HashSet is our default choice for Sets:
Set<Integer> numbers = new HashSet<>();
Java also provides sorted sets with the TreeSet implementation.
And for a Map the default choice is HashMap:
Map<String,Turtle> turtles = new HashMap<>();
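As a small sketch of how the two Set implementations behave, the example below reuses the integers 42, 1024, and -7 from the snapshot diagram above. The class name SetDemo and its method names are hypothetical. A HashSet ignores a duplicate add, and a TreeSet iterates over its elements in sorted order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetDemo {
    /** Adding the same value twice does not grow the set. */
    static int uniqueCount() {
        Set<Integer> numbers = new HashSet<>();
        numbers.add(42);
        numbers.add(1024);
        numbers.add(-7);
        numbers.add(42);           // already present, so the set is unchanged
        return numbers.size();
    }

    /** A TreeSet iterates in ascending order, whatever order the elements arrived in. */
    static List<Integer> sortedOrder() {
        Set<Integer> sorted = new TreeSet<>(Set.of(42, 1024, -7));
        return new ArrayList<>(sorted);   // copy into a list, preserving iteration order
    }

    public static void main(String[] args) {
        System.out.println(uniqueCount());  // prints 3
        System.out.println(sortedOrder());  // prints [-7, 42, 1024]
    }
}
```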
Iteration
List<String> cities = new ArrayList<>();
Set<Integer> numbers = new HashSet<>();
Map<String,Turtle> turtles = new HashMap<>();
A very common task is iterating through our cities/numbers/turtles/etc.
In Python:
for city in cities:
    print(city)
for num in numbers:
    print(num)
for key in turtles:
    print("%s: %s" % (key, turtles[key]))
Java provides a similar syntax for iterating over the items in Lists and Sets.
for (String city : cities) {
    System.out.println(city);
}
for (int num : numbers) {
    System.out.println(num);
}
Note in the second example above that we declare int num rather than Integer num, even though numbers is a Set<Integer>.
This is because Java automatically converts between int and Integer anyway, and using the simpler primitive int type is preferable to the Integer object wrapper wherever possible.
For example, int equality is simpler and less error-prone, as discussed in == vs. equals above.
We can’t iterate over Maps themselves this way, but we can iterate over the keys as we did in Python:
for (String key : turtles.keySet()) {
    System.out.println(key + ": " + turtles.get(key));
}
Under the hood this kind of for loop uses an Iterator, a design pattern we’ll see later in the class.
Warning: be careful not to mutate a collection while you’re iterating over it. Adding, removing, or replacing elements disrupts the iteration and can even cause your program to crash. We’ll discuss the reason in more detail in a future class. Note that this warning applies to Python as well. The code below does not do what you might expect:
numbers = [100,200,300]
for num in numbers:
    numbers.remove(num) # danger!!! mutates the list we're iterating over
print(numbers) # list should be empty here -- is it?
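The same mistake can be sketched in Java. The class and method names below are hypothetical. With an ArrayList, removing an element inside the for loop makes the next step of the iteration throw ConcurrentModificationException. One way around it, shown in the second method, is to iterate over a copy of the list while removing from the original. The loop variable is declared as Integer rather than int so that remove(num) removes the element equal to num rather than the element at index num.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class IterationMutationDemo {
    /** Returns true if removing during iteration disrupts the loop with an exception. */
    static boolean removeWhileIteratingFails() {
        List<Integer> numbers = new ArrayList<>(List.of(100, 200, 300));
        try {
            for (Integer num : numbers) {
                numbers.remove(num);   // danger: mutates the list we're iterating over
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Iterates over a copy, so removing from the original does not disturb the loop. */
    static List<Integer> removeUsingCopy() {
        List<Integer> numbers = new ArrayList<>(List.of(100, 200, 300));
        for (Integer num : new ArrayList<>(numbers)) {
            numbers.remove(num);       // safe: the loop is walking the copy
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIteratingFails()); // prints true
        System.out.println(removeUsingCopy());           // prints []
    }
}
```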
Iterating with indices
Java also provides a different kind of for loop that we can use to iterate through a list using its indices:
for (int ii = 0; ii < cities.size(); ii++) {
    System.out.println(cities.get(ii));
}
Unless we actually need the index value ii, this code is verbose and has more places for bugs to hide (starting at 1 instead of 0, using <= instead of <, using the wrong variable name for one of the occurrences of ii, …).
Avoid it if you can.
reading exercises
Java Maps work like Python dictionaries.
After we run this code:
Map<String, Double> treasures = new HashMap<>();
String x = "palm";
treasures.put("beach", 25.);
treasures.put("palm", 50.);
treasures.put("cove", 75.);
treasures.put("x", 100.);
treasures.put("palm", treasures.get("palm") + treasures.size());
treasures.remove("beach");
double found = 0;
for (double treasure : treasures.values()) {
    found += treasure;
}
What is the value of…
Enumerations
Sometimes a type has a small, finite set of immutable values, such as:
- months of the year: January, February, …, November, December
- days of the week: Monday, Tuesday, …, Saturday, Sunday
- compass points: north, south, east, west
- available colors: black, gray, red, …
When the set of values is small and finite, it makes sense to define all the values as named constants, which Java calls an enumeration and expresses with the enum construct.
public enum Month {
    JANUARY, FEBRUARY, MARCH, APRIL,
    MAY, JUNE, JULY, AUGUST,
    SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER;
}
public enum PenColor {
    BLACK, GRAY, RED, PINK, ORANGE,
    YELLOW, GREEN, CYAN, BLUE, MAGENTA;
}
You can use an enumeration type name like PenColor in a variable or method declaration:
PenColor drawingColor;
Refer to the values of the enumeration as if they were named static constants:
drawingColor = PenColor.RED;
Note that an enumeration is a distinct new type. Older languages, like Python 2 and early versions of Java, tend to use numeric constants or string literals to represent a finite set of values like this. But an enumeration is more typesafe, because it can catch mistakes like type mismatches:
int month = TUESDAY; // no error if integers are used
Month month = DayOfWeek.TUESDAY; // static error if enumerations are used
String color = "REd"; // no error, misspelling isn't caught
PenColor drawingColor = PenColor.REd; // static error when enumeration value is misspelled
Python 3 has enumerations, similar to Java’s, though not statically type-checked.
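Here is a minimal sketch of using the PenColor enumeration in code. The wrapper class EnumDemo and its methods isRed and countColors are hypothetical. PenColor is nested inside the class only so that the example fits in one file. The values() method is provided automatically for every enum and returns all of its constants.

```java
class EnumDemo {
    enum PenColor {
        BLACK, GRAY, RED, PINK, ORANGE,
        YELLOW, GREEN, CYAN, BLUE, MAGENTA;
    }

    /** Compares an enumeration value against a named constant. */
    static boolean isRed(PenColor color) {
        return color == PenColor.RED;
    }

    /** Counts the constants defined by the enumeration. */
    static int countColors() {
        return PenColor.values().length;
    }

    public static void main(String[] args) {
        PenColor drawingColor = PenColor.RED;
        System.out.println(isRed(drawingColor));   // prints true
        System.out.println(isRed(PenColor.BLUE));  // prints false
        System.out.println(countColors());         // prints 10
    }
}
```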
Java API documentation
Previous sections in this reading have a number of links to documentation for classes that are part of the Java platform API.
API stands for application programming interface. If you want to program an app that talks to Facebook, Facebook publishes an API (more than one, in fact, for different languages and frameworks) you can program against. The Java API is a large set of generally useful tools for programming pretty much anything.
- java.lang.String is the full name for String. We can create objects of type String just by using "double quotes".
- java.lang.Integer and the other primitive wrapper classes. Java automagically converts between primitive and wrapped (or “boxed”) types in most situations.
- java.util.List is like a Python list, but in Python, lists are part of the language. In Java, Lists are implemented in… Java!
- java.util.Map is like a Python dictionary.
- java.io.File represents a file on disk. Take a look at the methods provided by File: we can test whether the file is readable, delete the file, see when it was last modified…
- java.io.FileReader lets us read text files.
- java.io.BufferedReader lets us read in text efficiently, and it also provides a very useful feature: reading an entire line at a time.
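As a sketch of the line-at-a-time feature, the example below wraps a BufferedReader around a StringReader. That is an assumption made so the example runs without a file on disk; to read a real text file you would wrap a FileReader instead. The class name ReadLinesDemo and the method readLines are hypothetical. readLine() returns null once there is no more text to read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    /** Splits text into lines by reading it one line at a time. */
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        // StringReader stands in for FileReader so the example needs no file
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line = reader.readLine();
            while (line != null) {          // null means the end of the text
                lines.add(line);
                line = reader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines("Boston\nBarcelona")); // prints [Boston, Barcelona]
    }
}
```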
Let’s take a closer look at the documentation for BufferedReader.
There are many things here that relate to features of Java we haven’t discussed!
Keep your head, and focus on the things in bold below.
At the top right corner is the search box. You can use it to jump to a class or interface, or straight to a particular method.
Under the Class BufferedReader heading is the class hierarchy for BufferedReader and a list of implemented interfaces.
A BufferedReader object has all of the methods of all those types (plus its own methods) available to use.
Next we see direct subclasses, and for an interface, implementing classes.
This can help us find, for example, that HashMap is an implementation of Map.
Next up: a description of the class. Sometimes these descriptions are a little obscure, but this is the first place you should go to understand a class.
If you want to make a new BufferedReader, the constructor summary is the first place to look.
Constructors aren’t the only way to get a new object in Java, but they are the most common.
Next: the method summary lists all the methods we can call on a BufferedReader object.
Below the summary are detailed descriptions of each method and constructor. Click a constructor or method to see the detailed description. This is the first place you should go to understand what a method does.
Each detailed description includes:
Specifications
These detailed descriptions are specifications.
They allow us to use tools like String, Map, or BufferedReader without having to read or understand the code that implements them.
Reading, writing, understanding, and analyzing specifications will be one of our first major undertakings in 6.031, starting in a few classes.
reading exercises
Use the Java API docs to answer…
Suppose we have a class TreasureChest.
After we run this code:
Map<String, TreasureChest> treasures = new HashMap<>();
treasures.put("beach", new TreasureChest(25));
TreasureChest result = treasures.putIfAbsent("beach", new TreasureChest(75));
Reading exercises
At this point you should have completed all the reading exercises above. Completing the reading exercises prepares you for the nanoquiz at the beginning of each class meeting. To check your reading exercise status, see classes/02-basic-java on Omnivore. Submitting the exercises is required by 10:00pm MIT time the evening before class.
At this point you should have also completed the Java Tutor levels shown below. Java Tutor exercises are also due 10:00pm MIT time the evening before class.