Zero-one Schubert polynomials

We prove that if $\sigma \in S_m$ is a pattern of $w \in S_n$, then we can express the Schubert polynomial $\mathfrak{S}_w$ as a monomial times $\mathfrak{S}_\sigma$ (in reindexed variables) plus a polynomial with nonnegative coefficients. This implies that the set of permutations whose Schubert polynomials have all their coefficients equal to either 0 or 1 is closed under pattern containment. Using Magyar's orthodontia, we characterize this class by a list of twelve avoided patterns. We also give other equivalent conditions on $\mathfrak{S}_w$ being zero-one. In this case, the Schubert polynomial $\mathfrak{S}_w$ is equal to the integer point transform of a generalized permutahedron.


Introduction
Schubert polynomials, introduced by Lascoux and Schützenberger in [10], represent cohomology classes of Schubert cycles in the flag variety. Knutson and Miller also showed them to be multidegrees of matrix Schubert varieties [7]. There are a number of combinatorial formulas for Schubert polynomials [1,2,5,6,9,12,14,17], yet only recently has the structure of their supports been investigated: the support of a Schubert polynomial $\mathfrak{S}_w$ is the set of all integer points of a certain generalized permutahedron $P(w)$ [4,15]. The question motivating this paper is to characterize when $\mathfrak{S}_w$ equals the integer point transform of $P(w)$, in other words, when all the coefficients of $\mathfrak{S}_w$ are equal to 0 or 1.
One of our main results is a pattern-avoidance characterization of the permutations corresponding to these polynomials:

Theorem 1.1. The Schubert polynomial $\mathfrak{S}_w$ is zero-one if and only if $w$ avoids the patterns 12543, 13254, 13524, 13542, 21543, 125364, 125634, 215364, 215634, 315264, 315624, and 315642.

In Theorem 4.8 we provide further equivalent conditions on the Schubert polynomial $\mathfrak{S}_w$ being zero-one. One implication of Theorem 1.1 follows from our other main result, which relates the Schubert polynomials $\mathfrak{S}_\sigma$ and $\mathfrak{S}_w$ when $\sigma$ is a pattern of $w$:

Theorem 1.2. Fix $w \in S_n$ and let $\sigma \in S_{n-1}$ be the pattern with Rothe diagram $D(\sigma)$ obtained by removing row $k$ and column $w_k$ from $D(w)$. Then
$$\mathfrak{S}_w(x_1, \dots, x_n) = M(x_1, \dots, x_n)\,\mathfrak{S}_\sigma(x_1, \dots, \widehat{x_k}, \dots, x_n) + F(x_1, \dots, x_n), \qquad (1)$$
where $M$ is a monomial and $F \in \mathbb{Z}_{\geq 0}[x_1, \dots, x_n]$.

Theorem 1.2 is a special case of Theorem 5.1, which applies to the dual character of the flagged Weyl module of any diagram.
Outline of this paper. Section 2 gives an expression, due to Magyar, of Schubert polynomials in terms of orthodontic sequences $(i, m)$. In Section 3, we give a condition, called multiplicity-freeness, on the orthodontic sequence $(i, m)$ of $w$ which implies that $\mathfrak{S}_w$ is zero-one. In Section 4 we show that multiplicity-freeness can equivalently be phrased in terms of pattern avoidance, and then prove that multiplicity-freeness is also a necessary condition for $\mathfrak{S}_w$ to be zero-one. In the latter proof we assume Theorem 1.2, whose generalization (Theorem 5.1) and proof are the subject of Section 5.

Magyar's orthodontia for Schubert polynomials
In this section we explain the results we use to show one direction of Theorem 1.1. We include the classical definition of Schubert polynomials here for reference.
For $w \in S_n$ with $w \neq w_0$ (the longest permutation), there exists $i \in [n-1]$ such that $w(i) < w(i+1)$. For any such $i$, the Schubert polynomial $\mathfrak{S}_w$ is defined as $\mathfrak{S}_w(x_1, \dots, x_n) := \partial_i \mathfrak{S}_{ws_i}(x_1, \dots, x_n)$, where $s_i$ is the transposition swapping $i$ and $i+1$, and $\partial_i$ is the $i$th divided difference operator
$$\partial_i f = \frac{f - s_i f}{x_i - x_{i+1}},$$
with $s_i$ acting on a polynomial $f$ by exchanging $x_i$ and $x_{i+1}$. The recursion starts from $\mathfrak{S}_{w_0} := x_1^{n-1} x_2^{n-2} \cdots x_{n-1}$. Since the operators $\partial_i$ satisfy the braid relations, the Schubert polynomials $\mathfrak{S}_w$ are well-defined. We will not be using the above definition of Schubert polynomials in this work. Instead, we will make use of several results due to Magyar in [13]. We start by summarizing Propositions 15 and 16 of [13] and supplying the necessary background, closely following the exposition of [13].
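For concreteness, here is the recursion carried out in $S_3$; this worked example is ours, not part of the original text:

```latex
\begin{aligned}
\mathfrak{S}_{321} &= x_1^2 x_2, &
\mathfrak{S}_{231} &= \partial_1(x_1^2 x_2) = x_1 x_2, &
\mathfrak{S}_{312} &= \partial_2(x_1^2 x_2) = x_1^2, \\
\mathfrak{S}_{213} &= \partial_2(x_1 x_2) = x_1, &
\mathfrak{S}_{132} &= \partial_1(x_1^2) = x_1 + x_2, &
\mathfrak{S}_{123} &= \partial_1(x_1) = 1.
\end{aligned}
```

All six of these polynomials are zero-one, consistent with the fact that the twelve avoided patterns in the abstract all have length at least 5.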
By a diagram, we mean a sequence $D = (C_1, C_2, \dots, C_n)$ of finite subsets of $[n]$, called the columns of $D$. We think of $D$ as a collection of boxes $(i, j)$ in a grid, viewing an element $i \in C_j$ as a box in row $i$ and column $j$ of the grid. When we draw diagrams, we read the indices as in a matrix: $i$ increases top-to-bottom and $j$ increases left-to-right. Two diagrams $D$ and $D'$ are called column-equivalent if one is obtained from the other by reordering nonempty columns and adding or removing any number of empty columns. For a column $C \subseteq [n]$, let the multiplicity $\mathrm{mult}_D(C)$ be the number of columns of $D$ which are equal to $C$. The sum of two diagrams, denoted $D \oplus D'$, is constructed by concatenating their lists of columns; graphically, this means placing $D'$ to the right of $D$.
The Rothe diagram $D(w)$ of a permutation $w \in S_n$ is the diagram
$$D(w) = \{(i, j) \in [n] \times [n] : w(i) > j \text{ and } w^{-1}(j) > i\}.$$
Note that Rothe diagrams have the northwest property: if $(r, c'), (r', c) \in D(w)$ with $r < r'$ and $c < c'$, then $(r, c) \in D(w)$. We next recall Magyar's orthodontia. Let $D$ be the Rothe diagram of a permutation $w \in S_n$ with columns $C_1, C_2, \dots, C_n$. We describe an algorithm for constructing a reduced word $i = (i_1, \dots, i_l)$ and a multiplicity list $m = (k_1, \dots, k_n; m_1, \dots, m_l)$ such that the diagram $D_{i,m}$ defined by
$$D_{i,m} := k_1 \cdot [1] \oplus \cdots \oplus k_n \cdot [n] \oplus s_{i_1}\big(m_1 \cdot [i_1] \oplus s_{i_2}\big(m_2 \cdot [i_2] \oplus \cdots \oplus s_{i_l}\big(m_l \cdot [i_l]\big) \cdots \big)\big)$$
is column-equivalent to $D$, where $s_i$ acts on a diagram by exchanging rows $i$ and $i+1$. In the above, $m \cdot C$ denotes $C \oplus \cdots \oplus C$ with $m$ copies of $C$; in particular $0 \cdot C$ should be interpreted as a diagram with no columns, not the empty column.
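The Rothe diagram is easy to compute directly from the definition above. A small Python sketch (the function name is ours), checked against the permutation $w = 31542$ used in the examples below:

```python
def rothe_diagram(w):
    """Rothe diagram D(w) = {(i, j) : w(i) > j and w^{-1}(j) > i},
    with rows and columns indexed 1..n, for w in one-line notation."""
    n = len(w)
    winv = {value: pos for pos, value in enumerate(w, start=1)}
    return {(i, j)
            for i in range(1, n + 1)
            for j in range(1, n + 1)
            if w[i - 1] > j and winv[j] > i}
```

For $w = 31542$ this returns the five boxes $\{(1,1), (1,2), (3,2), (3,4), (4,2)\}$; the number of boxes of $D(w)$ always equals the number of inversions of $w$.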
The algorithm to produce $i$ and $m$ from $D$ is as follows. To begin the first step, for each $j \in [n]$ let $k_j = \mathrm{mult}_D([j])$, the number of columns of $D$ of the form $[j]$. Replace all such columns by empty columns, for each $j$, to get a new diagram $D^-$.
Given a column $C \subseteq [n]$, a missing tooth of $C$ is a positive integer $i$ such that $i \notin C$ but $i + 1 \in C$. The only columns without missing teeth are the empty column and the intervals $[i]$. Hence the first nonempty column of $D^-$ (if there is any) contains a smallest missing tooth $i_1$. Switch rows $i_1$ and $i_1 + 1$ of $D^-$ to get a new diagram $D'$.
In the second step, repeat the above with $D'$ in place of $D$. That is, let $m_1 = \mathrm{mult}_{D'}([i_1])$ and replace all columns of the form $[i_1]$ in $D'$ by empty columns to get a new diagram $(D')^-$. Find the smallest missing tooth $i_2$ of the first nonempty column of $(D')^-$, and switch rows $i_2$ and $i_2 + 1$ of $(D')^-$ to get a new diagram $D''$.
Continue in this fashion until no nonempty columns remain. It is easily seen that the sequences i = (i 1 , . . . , i l ) and m = (k 1 , . . . , k n ; m 1 , . . . , m l ) just constructed have the desired properties.
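The steps above can be transcribed directly into code. The following Python sketch (our own; all names are ours) runs the algorithm on the list of columns of a northwest diagram, such as a Rothe diagram:

```python
def orthodontia(columns, n):
    """Magyar's orthodontia on a diagram given as a list of column sets.
    Returns the word i = (i_1, ..., i_l) and the multiplicity list
    m = (k_1, ..., k_n; m_1, ..., m_l).  Follows the text: only columns
    of the form [j] (resp. [i_r]) are stripped at each stage."""
    cols = [set(c) for c in columns]

    def strip(target):
        # Replace all columns equal to `target` by empty columns, counting them.
        count = sum(1 for c in cols if c == target)
        for idx, c in enumerate(cols):
            if c == target:
                cols[idx] = set()
        return count

    k = [strip(set(range(1, j + 1))) for j in range(1, n + 1)]
    i_seq, m_seq = [], []
    while any(cols):
        first = next(c for c in cols if c)
        # smallest missing tooth of the first nonempty column
        i = min(t for t in range(1, n) if t not in first and t + 1 in first)
        for c in cols:                       # swap rows i and i+1 everywhere
            if (i in c) != (i + 1 in c):
                c.symmetric_difference_update({i, i + 1})
        m_seq.append(strip(set(range(1, i + 1))))
        i_seq.append(i)
    return i_seq, (k, m_seq)
```

Running it on the columns of $D(31542)$, namely $\{1\}, \{1,3,4\}, \emptyset, \{3\}, \emptyset$, reproduces the orthodontic sequence $i = (2, 3, 1)$ and $m = (1, 0, 0, 0, 0;\, 0, 1, 1)$ of Example 2.8.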
Definition 2.2. The pair $(i, m)$ constructed by the preceding algorithm is called the orthodontic sequence of $w$.
Given a permutation $w \in S_n$ with orthodontic sequence $(i, m)$, we will define a set $\mathcal{T}_w$ of fillings of the diagram $D_{i,m}$ which satisfy $\mathfrak{S}_w = \sum_{T \in \mathcal{T}_w} x^{\mathrm{wt}(T)}$. We start by recalling the root operators, first defined in [11]. These are operators $f_i$ which either take a filling $T$ of a diagram $D$ to another filling of $D$, or are undefined on $T$. To define root operators, we first encode a filling $T$ in terms of its reading word. The reading word of a filling $T$ of a diagram $D = D_{i,m}$ is the sequence of the entries of $T$ read down each column, going left-to-right along columns; that is, the sequence $T(1,1), T(2,1), \dots, T(n,1), T(1,2), T(2,2), \dots, T(n,2), \dots, T(n,n)$, ignoring any boxes $(i, j) \notin D$. If it is defined, the operator $f_i$ changes an entry $i$ of $T$ to an entry $i+1$ according to the following rule. First, ignore all the entries in $T$ except those which equal $i$ or $i+1$. Now "match parentheses": if, in the list of entries not yet ignored, an $i$ is followed by an $i+1$, then henceforth ignore that pair of entries as well; look again for an $i$ followed (up to ignored entries) by an $i+1$, and henceforth ignore this pair; continue doing this until no such pairs remain unignored. The remaining entries of $T$ form a subword of the form $i+1, i+1, \dots, i+1, i, i, \dots, i$. If $i$ does not appear in this subword, then $f_i(T)$ is undefined. Otherwise, $f_i$ changes the leftmost such $i$ to an $i+1$. Reading the new word back into $D$ produces a new filling. We can iteratively apply $f_i$ to a filling $T$. For example, if $T$ has reading word
$$3\,1\,2\,2\,2\,1\,3\,1\,2\,4\,3\,2\,4\,1\,3\,1,$$
then ignoring all entries other than $1$ and $2$ leaves
$$\cdot\,1\,2\,2\,2\,1\,\cdot\,1\,2\,\cdot\,\cdot\,2\,\cdot\,1\,\cdot\,1,$$
and matching parentheses in stages leaves first
$$\cdot\,\cdot\,\cdot\,2\,2\,1\,\cdot\,\cdot\,\cdot\,\cdot\,\cdot\,2\,\cdot\,1\,\cdot\,1$$
and then
$$\cdot\,\cdot\,\cdot\,2\,2\,\cdot\,\cdot\,\cdot\,\cdot\,\cdot\,\cdot\,\cdot\,\cdot\,1\,\cdot\,1.$$
Hence $f_1(T)$ has reading word $3\,1\,2\,2\,2\,1\,3\,1\,2\,4\,3\,2\,4\,2\,3\,1$, and $f_1^2(T)$ has reading word $3\,1\,2\,2\,2\,1\,3\,1\,2\,4\,3\,2\,4\,2\,3\,2$. Next, consider the column $[j]$ and its minimal column-strict filling $\omega_j$ (the box in row $j'$ is filled with $j'$). For a filling $T$ of a diagram $D$ with columns $(C_1, C_2, \dots, C_n)$, define in the obvious way the composite filling of $[j] \oplus D$, corresponding to concatenating the reading words of $[j]$ and $D$. Define $[j]^r \oplus D$ analogously by adding $r$ columns $[j]$ to $D$, each with filling $\omega_j$.
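The parenthesis-matching rule is equivalent to a single left-to-right stack scan. A Python sketch of $f_i$ acting on reading words (our own code, with names of our choosing), reproducing the example above:

```python
def f(i, word):
    """Root operator f_i on a reading word.  Pair each entry i with the
    nearest following unmatched i+1 ("matching parentheses"), then change
    the leftmost unmatched i to i+1.  Returns None if f_i is undefined."""
    word = list(word)
    unmatched = []                     # positions of currently unmatched i's
    for pos, value in enumerate(word):
        if value == i:
            unmatched.append(pos)
        elif value == i + 1 and unmatched:
            unmatched.pop()            # this i+1 closes the nearest open i
    if not unmatched:
        return None                    # no unmatched i: f_i is undefined
    word[unmatched[0]] = i + 1         # change the leftmost unmatched i
    return word
```

On the reading word of the example, `f(1, ...)` changes the $1$ in position 14 to a $2$, and a second application changes the $1$ in position 16, matching $f_1(T)$ and $f_1^2(T)$ above.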
Definition 2.6. Let $w \in S_n$ be a permutation with orthodontic sequence $(i, m)$, $i = (i_1, \dots, i_l)$. Define the set $\mathcal{T}_w$ of tableaux by
$$\mathcal{T}_w = \left\{ \omega \oplus f_{i_1}^{p_1}\big( \omega_{i_1}^{m_1} \oplus f_{i_2}^{p_2}\big( \cdots \omega_{i_{l-1}}^{m_{l-1}} \oplus f_{i_l}^{p_l}\big( \omega_{i_l}^{m_l} \big) \cdots \big) \big) : p_1, \dots, p_l \geq 0 \right\},$$
discarding any expression in which some root operator is undefined; here $\omega$ denotes the minimal column-strict filling of $k_1 \cdot [1] \oplus \cdots \oplus k_n \cdot [n]$, and $\omega_{i_r}^{m_r}$ abbreviates $m_r$ copies of $[i_r]$, each with filling $\omega_{i_r}$.

Theorem 2.7 ([13, Proposition 16]). Let $w \in S_n$ be a permutation with orthodontic sequence $(i, m)$. Then
$$\mathfrak{S}_w = \sum_{T \in \mathcal{T}_w} x^{\mathrm{wt}(T)},$$
where $\mathrm{wt}(T)_a$ is the number of entries of $T$ equal to $a$.

Example 2.8. Consider again $w = 31542$, so the orthodontic sequence of $w$ is $i = (2, 3, 1)$ and $m = (1, 0, 0, 0, 0; 0, 1, 1)$. The set $\mathcal{T}_w$ is built up from the single filling $\omega_{i_3}^{m_3}$ by applying the root operators $f_1$, $f_3$, and $f_2$ in turn.

We now describe a way to view each step of the construction of $\mathcal{T}_w$ as producing a set of fillings of a diagram.

Definition 2.9. Let $w$ be a permutation with orthodontic sequence $(i, m)$, $i = (i_1, \dots, i_l)$. For each $r \in [l]$, define
$$\mathcal{T}_w(r) = \left\{ \omega_{i_r}^{m_r} \oplus f_{i_{r+1}}^{p_{r+1}}\big( \cdots \omega_{i_{l-1}}^{m_{l-1}} \oplus f_{i_l}^{p_l}\big( \omega_{i_l}^{m_l} \big) \cdots \big) : p_{r+1}, \dots, p_l \geq 0 \right\},$$
again discarding undefined expressions, and set $\mathcal{T}_w(0) = \mathcal{T}_w$.

Definition 2.10. Let $w$ be a permutation with orthodontic sequence $(i, m)$, $i = (i_1, \dots, i_l)$. For any $r \in [l]$, let $O(w, r)$ be the diagram obtained from $D(w)$ in the construction of $(i, m)$ at the time when the row swaps of the missing teeth $i_1, \dots, i_r$ have all been executed on $D(w)$, but, after the row swap of the missing tooth $i_r$, the columns without missing teeth have not yet been removed ($m_r$ has not yet been recorded). Set $O(w, 0) = D(w)$. For each $r$, give $O(w, r)$ the same column indexing as $D(w)$, so any columns replaced by empty columns in the execution of the missing teeth $i_1, \dots, i_{r-1}$ retain their original index in $D(w)$.
The motivation behind Definitions 2.9 and 2.10 is that the elements of $\mathcal{T}_w(r)$ can be viewed as column-strict fillings of $O(w, r)$ for each $r$. To do this, the choice of filling order for $O(w, r)$ is crucial. Let $w \in S_n$ and consider $D = D(w)$ and $D_{i,m}$. Suppose $D$ has $z$ nonempty columns. There is a unique permutation $\tau$ of $[n]$ taking the column indices of $D$ to the column indices of $D_{i,m} \oplus \emptyset^{n-z}$ with the following properties:
• Column $c$ of $D$ is the same as column $\tau(c)$ of $D_{i,m}$,
• If column $c$ and column $c'$ of $D$ are equal with $c < c'$, then $\tau(c) < \tau(c')$.
Recall that the columns of $O(w, r)$ have the same column labels as $D$. To read an element $T \in \mathcal{T}_w(r)$ into $O(w, r)$, read $T$ left-to-right and fill in, top-to-bottom, columns $\tau^{-1}(n), \tau^{-1}(n-1), \dots, \tau^{-1}(1)$ (ignoring any column indices referring to empty columns).
Lemma 2.11. Let $w \in S_n$ have orthodontic sequence $(i, m)$, $i = (i_1, \dots, i_l)$. In the filling order specified above, the elements of $\mathcal{T}_w(r)$ are column-strict fillings of $O(w, r)$ for each $0 \leq r \leq l$.

Example 2.12. Consider again $w = 31542$, so $\tau = 12435 = \tau^{-1}$. Consider the elements $1 \in \mathcal{T}_w(3)$, $1232 \in \mathcal{T}_w(2)$, $1242 \in \mathcal{T}_w(1)$, and $11342 \in \mathcal{T}_w(0)$. The column filling order of each $O(w, r)$ is given by reading $\tau^{-1}$ in one-line notation right to left: in the indexing of $D(w)$, fill down column 4, then down column 2, then down column 1. The elements of each set $\mathcal{T}_w(r)$ are column-strict fillings of the corresponding diagrams $O(w, r)$.

Lemma 2.13. Let $w$ be a permutation with orthodontic sequence $(i, m)$, $i = (i_1, \dots, i_l)$. For each $0 \leq r \leq l$, $O(w, r)$ has the northwest property.
Lemma 2.15. For each $0 \leq r \leq l$, the elements of $\mathcal{T}_w(r)$ are row-flagged fillings of $O(w, r)$, i.e., every entry in row $i$ is at most $i$.
Proof. Clearly, the singleton $\mathcal{T}_w(l)$ is a row-flagged filling of $O(w, l)$. Assume that for some $l \geq s > 0$, the result holds with $r = s$. We show that the result also holds with $r = s - 1$.

Zero-one Schubert polynomials
This section is devoted to giving a sufficient condition on the orthodontic sequence $(i, m)$ of $w$ for the Schubert polynomial $\mathfrak{S}_w$ to be zero-one. We give such a condition in Theorem 3.6. We will see in Theorem 4.8 that this condition turns out to also be a necessary condition for $\mathfrak{S}_w$ to be zero-one.
We start with a less ambitious result.

Proposition 3.1. Let $w \in S_n$ and let $(i, m)$ be the orthodontic sequence of $w$. If $i = (i_1, \dots, i_l)$ has distinct entries, then $\mathfrak{S}_w$ is zero-one.
Proof. Let $T, T' \in \mathcal{T}_w$ with $\mathrm{wt}(T) = \mathrm{wt}(T')$. By Definition 2.6, we can find $p_1, \dots, p_l$ so that $T$ is the element of $\mathcal{T}_w$ built by applying the root operators with exponents $p_1, \dots, p_l$. Each application of $f_{i_j}$ changes an entry $i_j$ to $i_j + 1$. Then, if $e_1, \dots, e_n$ denote the standard basis vectors of $\mathbb{R}^n$ and $v$ denotes the weight of the element of $\mathcal{T}_w$ with all exponents zero,
$$\mathrm{wt}(T) = v + \sum_{j=1}^{l} p_j\,(e_{i_j+1} - e_{i_j}).$$
Similarly, we can find $q_1, \dots, q_l$ so that $\mathrm{wt}(T') = v + \sum_{j=1}^{l} q_j\,(e_{i_j+1} - e_{i_j})$. As $\mathrm{wt}(T) = \mathrm{wt}(T')$,
$$\sum_{j=1}^{l} (p_j - q_j)(e_{i_j+1} - e_{i_j}) = 0. \qquad (*)$$
Since $i$ has distinct entries, the vectors $\{e_{i_j+1} - e_{i_j}\}_{j=1}^{l}$ are linearly independent, so $p_j = q_j$ for all $j$. Thus $T = T'$. This shows that all elements of $\mathcal{T}_w$ have distinct weights, so $\mathfrak{S}_w$ is zero-one.
We now strengthen Proposition 3.1 by allowing $i$ to have repeated entries. To do this, we will need a technical definition related to the orthodontic sequence. Recall the construction of the orthodontic sequence $(i, m)$ of a permutation $w \in S_n$ (Definition 2.2). For $j \in [l]$, let $I_w(j)$ be the set of indices of columns of $O(w, j-1)^-$ that are changed when rows $i_j$ and $i_j + 1$ are swapped to form $O(w, j)$. We call $w$ multiplicity-free if for every pair $r \neq s$ with $i_r = i_s$, we have $I_w(r) = I_w(s)$ and $|I_w(r)| = 1$.

The proof of the generalization of Proposition 3.1 will require the following technical lemma. Before proceeding, recall Lemmas 2.11 and 2.15: for every $0 \leq j \leq l$, elements of $\mathcal{T}_w(j)$ can be viewed as row-flagged, column-strict fillings of $O(w, j)$ (via the column filling order of $O(w, j)$ specified prior to Lemma 2.11). Applying $\omega_{i_{j-1}}^{m_{j-1}} \oplus f_{i_j}$ to an element of $\mathcal{T}_w(j)$ gives an element of $\mathcal{T}_w(j-1)$, a filling of $O(w, j-1)$. Thus, when we speak below of the application of $f_{i_j}$ to an element $T \in \mathcal{T}_w(j)$ "changing an $i_j$ to an $i_j + 1$ in column $c$", we mean that this change occurs when the resulting filling is viewed in $O(w, j-1)$.

Lemma 3.5. Let $w$ be multiplicity-free, and suppose $i_r = i_s$ with $r < s$ and $I_w(r) = I_w(s) = \{c\}$. Then for each $j$ with $r \leq j \leq s$, $I_w(j) = \{c\}$ and the application of $f_{i_j}$ to an element of $\mathcal{T}_w(j)$ is either undefined or changes an $i_j$ to an $i_j + 1$ in column $c$.
Proof. We handle first the case $j = r$. In the diagram $O(w, r-1)$, column $c$ is the leftmost column containing a missing tooth, and $i_r$ is the smallest missing tooth in column $c$. Reading column $c$ of $O(w, r-1)$ top-to-bottom, one sees a (possibly empty) sequence of boxes in $O(w, r-1)$, followed by a sequence of boxes not in $O(w, r-1)$. The sequence of boxes not in $O(w, r-1)$ has length at least two since $i_r$ occurs at least twice in $i$, and is followed by the box $(i_r + 1, c) \in O(w, r-1)$. Lastly, observe that for any $c' > c$ and $d > i_r + 1$, there can be no box $(d, c') \in O(w, r-1)$: otherwise there would be some $t \in [l]$ with $i_t = i_r$ and $t \neq r$ such that $c' \in I_w(t)$, violating that $w$ is multiplicity-free.
As a consequence of the previous observations, the largest row index in which a column $c' > c$ of $O(w, r-1)$ can contain a box is $i_r - 2$. In particular, Lemma 2.15 implies that the application of $f_{i_r}$ to an element of $\mathcal{T}_w(r)$ is either undefined or changes an $i_r$ to an $i_r + 1$ in column $c$. This concludes the case $j = r$.
When $j = s$, an entirely analogous argument works. The only significant difference is that when column $c$ of $O(w, s-1)$ is read top-to-bottom, the (possibly empty) initial sequence of boxes in $O(w, s-1)$ is followed by a sequence of boxes not in $O(w, s-1)$ of length at least 1, ending with the box $(i_s + 1, c)$. Consequently, the largest row index in which a column $c' > c$ of $O(w, s-1)$ can contain a box is $i_s - 1$. In particular, Lemma 2.15 implies that the application of $f_{i_s}$ to an element of $\mathcal{T}_w(s)$ is either undefined or changes an $i_s$ to an $i_s + 1$ in column $c$. This concludes the case $j = s$.

Now, let $r < j < s$. Since $I_w(r) = I_w(s) = \{c\}$, we have $c \in I_w(j)$. If $i_j$ occurs multiple times in $i$, then multiplicity-freeness of $w$ implies $I_w(j) = \{c\}$. In this case, we can find $j' \neq j$ with $i_{j'} = i_j$ and apply the previous argument (with $r$ and $s$ replaced by $j$ and $j'$) to conclude that the application of $f_{i_j}$ to an element of $\mathcal{T}_w(j)$ is either undefined or changes an $i_j$ to an $i_j + 1$ in column $c$.
Thus, we assume $i_j$ occurs only once in $i$. Recall that it was shown above that $O(w, r-1)$ has no boxes $(d, c')$ with $d > i_r$ and $c' > c$. Read top-to-bottom, column $c$ of $O(w, r-1)$ has a (possibly empty) initial sequence of boxes, with the first missing box occurring in row $u$; clearly $u \leq i_r - 1$. Since the first missing tooth in column $c$ of $O(w, r-1)$ is in row $i_r$, none of the boxes $(u, c), (u+1, c), \dots, (i_r, c)$ are in $O(w, r-1)$, but $(i_r + 1, c) \in O(w, r-1)$. Then the northwest property implies that there is no box of $O(w, r-1)$ in any column $c' > c$ in any of rows $u, u+1, \dots, i_r$. In particular, the largest row index in which a column $c' > c$ of $O(w, r-1)$ can contain a box is $u - 1$.
As $r < j < s$ and $I_w(r) = I_w(s) = \{c\}$, we have $c \in I_w(j)$. Also, since $r < j < s$, the leftmost nonempty column of $O(w, j-1)$ is column $c$, and $i_j \geq u$. Then in $O(w, j-1)$, the maximum row index of a box in a column $c' > c$ is $u - 1$. In particular, $I_w(j) = \{c\}$, and Lemma 2.15 implies that the application of $f_{i_j}$ to an element of $\mathcal{T}_w(j)$ is either undefined or changes an $i_j$ to an $i_j + 1$ in column $c$.
Theorem 3.6. If $w$ is multiplicity-free, then $\mathfrak{S}_w$ is zero-one.
Proof. Assume wt(T ) = wt(T ) for some T, T ∈ T w . If we can show that T = T , then we can conclude that all elements of T w have distinct weights, so S w is zero-one. To begin, write , for some p 1 , . . . , p l , q 1 , . . . , q l . The basic idea of the proof is to show that as T and T are constructed step-by-step from ω m l i l , the resulting intermediate tableaux are intermittently equal. At termination of the construction, this will imply that T = T .
By the expansion $(*)$ of $\mathrm{wt}(T) - \mathrm{wt}(T')$ used in the proof of Proposition 3.1, we observe that $p_u = q_u$ for all $u$ such that $i_u$ occurs only once in $i$. Let $s$ be the largest index such that $p_s \neq q_s$. Suppose $I_w(s) = \{c\}$. Let $r_1$ be the smallest index such that $i_{r_1}$ occurs multiple times in $i$ and $I_w(r_1) = \{c\}$. We know $r_1 < s$, because $(*)$ implies that $p_{s'} \neq q_{s'}$ for another $s' < s$ with $i_{s'} = i_s$, and by multiplicity-freeness $I_w(s') = \{c\}$. We wish to find an interval $[r, s] \subseteq [r_1, s]$ such that $r < s$ and
(i) if $v \geq r$ and $i_v$ occurs multiple times in $i$, then any other $v'$ with $i_{v'} = i_v$ satisfies $v' \geq r$;
(ii) for every $j$ with $r < j < s$ and $i_j$ occurring only once in $i$, there are $t$ and $u$ with $r \leq t < j < u \leq s$ such that $i_t = i_u$.
We first show that (i) holds for $[r_1, s]$. Note that if $i_v$ occurs multiple times in $i$ and $r_1 \leq v \leq s$, then it must be that $I_w(v) = \{c\}$, by the fact that the orthodontia construction records all missing teeth needed to eliminate one column before moving on to the next column. If $i_{v'} = i_v$, then $I_w(v') = \{c\}$ also, by multiplicity-freeness of $w$. The choice of $r_1$ implies $r_1 \leq v'$. If $i_v$ occurs multiple times in $i$ with $s < v$ and $I_w(v) = \{c\}$, then the choice of $r_1$ again implies that $r_1 \leq v'$ for any $v'$ with $i_{v'} = i_v$. If $i_v$ occurs multiple times in $i$ with $s < v$ and $I_w(v) \neq \{c\}$, then the orthodontia construction implies that any $v'$ with $i_{v'} = i_v$ must satisfy $s < v'$. In particular, $r_1 < v'$ as needed. Thus, (i) holds for $[r_1, s]$. If $[r_1, s]$ also satisfies (ii), then we are done.
Otherwise, assume $[r_1, s]$ does not satisfy (ii). Then there is some $j$ with $r_1 < j < s$ such that $i_j$ occurs only once in $i$ and there are no $t$ and $u$ with $r_1 \leq t < j < u \leq s$ and $i_t = i_u$. Consequently, for every pair $i_u = i_t$ with $r_1 \leq t < u \leq s$, it must be that either $t < u < j$ or $j < t < u$. Let $r_2$ be the smallest index such that $j < r_2$ and $i_{r_2}$ occurs multiple times in $i$. By the choice of $j$, it is clear that the interval $[r_2, s]$ still satisfies (i). If $[r_2, s]$ also satisfies (ii), then we are done.
Otherwise, $[r_2, s]$ satisfies (i) but not (ii), and we can argue exactly as in the case of $[r_1, s]$ to find an $r_3$ with $r_2 < r_3 < s$ such that $[r_3, s]$ satisfies (i). Continue working in this fashion. We show that this process terminates with an interval $[r, s]$ satisfying $r < s$, (i), and (ii).
As mentioned above, there exists $s' < s$ such that $i_{s'} = i_s$. Let $s'$ be the maximal index less than $s$ with this property. Since all of the intervals $[r_*, s]$ satisfy (i), it follows that $r_1 < r_2 < \cdots \leq s'$. At worst, the process will terminate after finitely many steps with the interval $[s', s]$. The interval $[s', s]$ will then satisfy (i) since the process reached it, and will trivially satisfy (ii) since $i_{s'} = i_s$.
Hence, we can assume that we have found an interval $[r, s]$ satisfying $r < s$, (i), and (ii). Consider the intermediate tableaux
$$T_r = \omega_{i_{r-1}}^{m_{r-1}} \oplus f_{i_r}^{p_r}\big( \omega_{i_r}^{m_r} \oplus \cdots \oplus \omega_{i_{s-1}}^{m_{s-1}} \oplus f_{i_s}^{p_s}(T_s) \cdots \big), \qquad T'_r = \omega_{i_{r-1}}^{m_{r-1}} \oplus f_{i_r}^{q_r}\big( \omega_{i_r}^{m_r} \oplus \cdots \oplus \omega_{i_{s-1}}^{m_{s-1}} \oplus f_{i_s}^{q_s}(T'_s) \cdots \big)$$
arising in the constructions of $T$ and $T'$, where $T_s, T'_s \in \mathcal{T}_w(s)$. By definition, $T_r, T'_r \in \mathcal{T}_w(r-1)$, so we can view $T_r$ and $T'_r$ as fillings of $O(w, r-1)$; similarly, we can view $T_s$ and $T'_s$ as fillings of $O(w, s)$. Since we chose $s$ to be the largest index such that $p_s \neq q_s$, it follows that $T_s = T'_s$. By property (i) of $[r, s]$, $i_u \neq i_v$ for any $u < r \leq v$. Hence, it must be that $\mathrm{wt}(T_r) = \mathrm{wt}(T'_r)$. Finally, property (ii) of $[r, s]$ allows us to apply Lemma 3.5 and conclude that for any $a_r, a_{r+1}, \dots, a_s \geq 0$, when $\omega_{i_{r-1}}^{m_{r-1}} \oplus f_{i_r}^{a_r}\big( \omega_{i_r}^{m_r} \oplus \cdots \oplus \omega_{i_{s-1}}^{m_{s-1}} \oplus f_{i_s}^{a_s}(-) \cdots \big)$ is applied to an element of $\mathcal{T}_w(s)$, only the entries in column $c$ are affected by the root operators $f_{i_r}^{a_r}, \dots, f_{i_s}^{a_s}$. Since $T_s = T'_s$, the fillings $T_r$ and $T'_r$ must coincide outside of column $c$. Since we already deduced that $\mathrm{wt}(T_r) = \mathrm{wt}(T'_r)$, it follows that column $c$ of $T_r$ and of $T'_r$ have the same weight. By column-strictness of $T_r$ and $T'_r$, column $c$ of $T_r$ and of $T'_r$ must coincide, so $T_r = T'_r$.
To complete the proof, let $\hat{s}$ be the largest index $\hat{s} < r$ such that $p_{\hat{s}} \neq q_{\hat{s}}$. If no such index exists, then $T = T'$. Otherwise, set $\hat{r}_1$ to be the smallest index such that $i_{\hat{r}_1}$ occurs multiple times in $i$ and $I_w(\hat{r}_1) = I_w(\hat{s})$. We have $\hat{r}_1 < \hat{s}$ because, as before, some $\hat{s}'$ distinct from $\hat{s}$ with $p_{\hat{s}'} \neq q_{\hat{s}'}$ and $i_{\hat{s}'} = i_{\hat{s}}$ must exist, and $\hat{s}'$ is also less than $r$ by property (i) of $[r, s]$. Use the previous algorithm to find an interval $[\hat{r}, \hat{s}] \subseteq [\hat{r}_1, \hat{s}]$ satisfying $\hat{r} < \hat{s}$, (i), and (ii). Construct $T_{\hat{r}}, T'_{\hat{r}}, T_{\hat{s}}, T'_{\hat{s}}$, and argue exactly as in the case of $[r, s]$ that $T_{\hat{r}} = T'_{\hat{r}}$.
Continuing in this manner for a finite number of steps shows that $T = T'$.
As we will show in Theorem 4.8, multiplicity-freeness of $w$ is not only sufficient but also necessary for the Schubert polynomial $\mathfrak{S}_w$ to be zero-one.

Pattern avoidance conditions for multiplicity-freeness
This section is devoted to showing that w being multiplicity-free is equivalent to a certain pattern avoidance condition. We then prove our full characterization of zero-one Schubert polynomials.
Given a Rothe diagram $D(w)$, we will call a tuple $(r_1, c_1, r_2, c_2, r_3)$ meeting the conditions of Definition 4.1 an instance of configuration A in $D(w)$. Similarly, we will call a tuple $(r_1, c_1, r_2, c_2, r_3, r_4)$ meeting the conditions of Definition 4.2 (resp. 4.3) an instance of configuration B (resp. B$'$) in $D(w)$.

Theorem 4.4. If $D(w)$ contains no instance of configuration A, B, or B$'$, then $w$ is multiplicity-free.

Proof. We prove the contrapositive. Assume $w$ is not multiplicity-free and let $(i, m)$ be the orthodontic sequence of $w$. Then we can find entries $i_{p_1} = i_{p_2}$ of $i$ with $p_1 < p_2$ such that either $I_w(p_1) \neq I_w(p_2)$, or $I_w(p_1) = I_w(p_2)$ with $|I_w(p_1)| > 1$. We show that $D(w)$ must contain at least one instance of configuration A, B, or B$'$.
Case 1: Assume the symmetric difference of $I_w(p_1)$ and $I_w(p_2)$ is nonempty. Take $c_1 \in I_w(p_1) \setminus I_w(p_2)$ and $c_2 \in I_w(p_2) \setminus I_w(p_1)$. We show that columns $c_1$ and $c_2$ of $D(w)$ contain an instance of configuration A. In step $p_1$ of the orthodontia on $D(w)$, a box in column $c_1$ is moved (by the missing tooth $i_{p_1}$) to row $i_{p_1}$. Let this box originally be in row $r_1$ of $D(w)$. Analogously, let the box in column $c_2$ moved to row $i_{p_2}$ in step $p_2$ of the orthodontia (by the missing tooth $i_{p_2}$) originally be in row $r_2$ of $D(w)$. Observe that $r_1 < r_2$. If $c_2 < c_1$, then the northwest property would imply that $(r_1, c_2) \in D(w)$, contradicting that $c_2 \notin I_w(p_1)$. Thus $c_1 < c_2$. Since $c_2 \notin I_w(p_1)$, $(r_1, c_2) \notin D(w)$. Lastly, since the box $(r_1, c_1)$ is moved by the orthodontia, there is some box $(r_3, c_1) \notin D(w)$ with $r_3 < r_1$. Consequently, $w_{r_3} < c_1$. Thus, $(r_1, c_1, r_2, c_2, r_3)$ is an instance of configuration A.
Case 2: Assume $I_w(p_2)$ is a proper subset of $I_w(p_1)$. Let $c_1 = \max(I_w(p_2))$ and $c_2 = \min(I_w(p_1) \setminus I_w(p_2))$. Let the box in column $c_1$ moved to row $i_{p_1} = i_{p_2}$ in step $p_1$ (resp. $p_2$) of the orthodontia originally be in row $r_1$ (resp. $r_2$) of $D(w)$. Observe that $r_1 < r_2$.
Case 3: Assume $I_w(p_1)$ is a proper subset of $I_w(p_2)$. This case is handled similarly to Case 2. Let $c_1 = \max(I_w(p_1))$ and $c_2 = \min(I_w(p_2) \setminus I_w(p_1))$. Let the box in column $c_1$ moved to row $i_{p_1} = i_{p_2}$ in step $p_1$ (resp. $p_2$) of the orthodontia originally be in row $r_1$ (resp. $r_2$) of $D(w)$. Observe that $r_1 < r_2$.
Case 4: Assume $I_w(p_1) = I_w(p_2)$ is not a singleton. Let $c_1, c_2 \in I_w(p_1)$ with $c_1 < c_2$. Let the box in column $c_1$ moved to row $i_{p_1} = i_{p_2}$ in step $p_1$ (resp. $p_2$) of the orthodontia originally be in row $r_1$ (resp. $r_2$) of $D(w)$. Observe that $r_1 < r_2$. Since the boxes $(r_1, c_1)$ and $(r_2, c_1)$ of $D(w)$ are moved weakly above row $i_{p_1}$ by the orthodontia, we can find indices $r_3, r_4$ with $r_4 < r_3 < r_1$ such that $(r_3, c_1), (r_4, c_1) \notin D(w)$. Then $w_{r_3} < c_1$ and $w_{r_4} < c_1$. Thus, $(r_1, c_1, r_2, c_2, r_3, r_4)$ is an instance of configuration B$'$.
We now relate multiplicity-freeness to pattern avoidance of permutations. We begin by clarifying our pattern-avoidance terminology. A pattern $\sigma$ of length $n$ is a permutation in $S_n$. The length $n$ is a crucial part of the data of a pattern; we make no identifications between patterns of different lengths, unlike what is usual when handling permutations in the Schubert calculus. A permutation $w$ contains $\sigma$ if $w$ has $n$ entries $w_{j_1}, \dots, w_{j_n}$ with $j_1 < j_2 < \cdots < j_n$ that are in the same relative order as $\sigma_1, \sigma_2, \dots, \sigma_n$. In this case, the indices $j_1 < j_2 < \cdots < j_n$ are called a realization of $\sigma$ in $w$. We say that $w$ avoids the pattern $\sigma$ if $w$ does not contain $\sigma$. To illustrate the dependence of these definitions on $n$, note that $w = 154623$ contains the pattern 132, but not the pattern 132456.
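This notion of containment can be checked by brute force over subsequences; a small Python sketch (our own, with names of our choosing):

```python
from itertools import combinations

def contains(w, sigma):
    """True if the permutation w (one-line notation) contains the pattern
    sigma: some subsequence of w is in the same relative order as sigma."""
    n = len(sigma)
    for positions in combinations(range(len(w)), n):
        sub = [w[p] for p in positions]
        # compare relative orders entrywise
        if all((sub[a] < sub[b]) == (sigma[a] < sigma[b])
               for a in range(n) for b in range(a + 1, n)):
            return True
    return False
```

For example, `contains([1, 5, 4, 6, 2, 3], [1, 3, 2])` holds, while `contains([1, 5, 4, 6, 2, 3], [1, 3, 2, 4, 5, 6])` does not, matching the example in the text.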
The following easy lemma gives a diagrammatic interpretation of pattern containment.

Lemma 4.5. Suppose $w$ contains a pattern $\sigma$, realized by indices $j_1 < j_2 < \cdots < j_n$. Then $D(\sigma)$ is obtained from $D(w)$ by restricting to the rows $j_1, \dots, j_n$ and the columns $w_{j_1}, \dots, w_{j_n}$.

Definition 4.6. Call the twelve patterns 12543, 13254, 13524, 13542, 21543, 125364, 125634, 215364, 215634, 315264, 315624, and 315642 the multiplicitous patterns, and denote this set of patterns by MPatt.

Theorem 4.7. The Rothe diagram $D(w)$ contains an instance of configuration A, B, or B$'$ if and only if $w$ contains a multiplicitous pattern.

Proof. It is easy to check (see Figure 2) that each of the twelve multiplicitous patterns contains an instance of configuration A, B, or B$'$. Lemma 4.5 implies that if $w$ contains $\sigma \in \mathrm{MPatt}$, then deleting some rows and columns from $D(w)$ yields $D(\sigma)$. Since $D(\sigma)$ contains at least one instance of configuration A, B, or B$'$, so does $D(w)$. Conversely, assume $D(w)$ contains at least one instance of configuration A, B, or B$'$. We must show that $w$ contains some multiplicitous pattern. Let $\tau_1, \tau_2, \dots, \tau_n$ be the $n$ patterns of length $n-1$ contained in $w$; say $\tau_j$ is realized in $w$ by forgetting $w_j$. Without loss of generality, we may assume none of $D(\tau_1), \dots, D(\tau_n)$ contains an instance of configuration A, B, or B$'$: if some $D(\tau_j)$ does contain an instance of one of these configurations, replace $w$ by $\tau_j$ and iterate.
For each $j$, $D(\tau_j)$ is obtained from $D(w)$ by deleting row $j$ and column $w_j$. Since $D(\tau_j)$ does not contain any instance of any of our three configurations, each cross $\{(j, q) : (j, q) \in D(w)\} \cup \{(p, w_j) : (p, w_j) \in D(w)\}$ intersects each instance of every configuration contained in $D(w)$. However, an instance of configuration A involves only three rows and two columns, and an instance of B or B$'$ involves only four rows and two columns. Thus, it must be that $w \in S_n$ for some $n \leq 6$. It can be checked by exhaustion that the only permutations in $S_n$ with $n \leq 6$ that are minimal (with respect to pattern containment) among those whose Rothe diagrams contain an instance of configuration A, B, or B$'$ are the twelve multiplicitous patterns.
We are now ready to state our full characterization of zero-one Schubert polynomials, and most of the elements of the proof are at hand.

Theorem 4.8. Let $w \in S_n$. The following are equivalent:
(i) the Schubert polynomial $\mathfrak{S}_w$ is zero-one;
(ii) $w$ is multiplicity-free;
(iii) $D(w)$ contains no instance of configuration A, B, or B$'$;
(iv) $w$ avoids all twelve multiplicitous patterns.

Proof. Theorem 3.6 shows (ii) ⇒ (i). Theorem 4.4 shows (iii) ⇒ (ii). Theorem 4.7 shows (iii) ⇔ (iv). The implication (i) ⇒ (iv) will follow immediately from Corollary 5.9, since the Schubert polynomials associated to the permutations 12543, 13254, 13524, 13542, 21543, 125364, 125634, 215364, 215634, 315264, 315624, and 315642 each have a coefficient equal to 2. We prove Corollary 5.9 in the next section.
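The coefficient 2 appearing in the smallest multiplicitous pattern can be reproduced from the divided-difference recursion recalled in Section 2. A minimal Python sketch of that recursion (our own code; polynomials are stored as dictionaries mapping exponent tuples to coefficients):

```python
def divided_difference(i, poly):
    """Divided difference: (f - s_i f)/(x_i - x_{i+1}) on a polynomial
    stored as {exponent tuple: coefficient}."""
    out = {}
    for mono, coeff in poly.items():
        a, b = mono[i - 1], mono[i]
        if a == b:
            continue               # symmetric in x_i, x_{i+1}: contributes 0
        sign = 1 if a > b else -1
        lo, hi = min(a, b), max(a, b)
        for t in range(lo, hi):    # geometric-sum expansion on a monomial
            m = list(mono)
            m[i - 1], m[i] = t, lo + hi - 1 - t
            out[tuple(m)] = out.get(tuple(m), 0) + sign * coeff
    return {m: c for m, c in out.items() if c}

def schubert(w):
    """Schubert polynomial of w (one-line notation), via the recursion
    starting from the staircase monomial for the longest element w_0."""
    n = len(w)
    w = tuple(w)
    if w == tuple(range(n, 0, -1)):                      # w = w_0
        return {tuple(range(n - 1, -1, -1)): 1}          # x_1^{n-1} ... x_{n-1}
    i = next(j for j in range(1, n) if w[j - 1] < w[j])  # an ascent of w
    ws = list(w)
    ws[i - 1], ws[i] = ws[i], ws[i - 1]                  # ws = w s_i > w
    return divided_difference(i, schubert(ws))
```

For instance, $\mathfrak{S}_{132} = x_1 + x_2$ comes out as `{(1, 0, 0): 1, (0, 1, 0): 1}`, and `schubert([1, 2, 5, 4, 3])` produces a polynomial with a coefficient equal to 2, consistent with the proof above.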

A coefficient-wise inequality for dual characters of flagged Weyl modules of diagrams
The aim of this section is to prove a generalization of Theorem 1.2, namely Theorem 5.1.

Theorem 5.1. Fix a diagram $D \subseteq [n] \times [n]$ and $k, l \in [n]$, and let $D'$ be the diagram obtained from $D$ by removing any boxes in row $k$ or column $l$. Then
$$\chi_D(x_1, \dots, x_n) = M(x_1, \dots, x_n)\,\chi_{D'}(x_1, \dots, x_n) + F(x_1, \dots, x_n),$$
where $M$ is a monomial and $F \in \mathbb{Z}_{\geq 0}[x_1, \dots, x_n]$.

We now explain the necessary background and terminology for Theorem 5.1 and its proof.
Let $G = \mathrm{GL}(n, \mathbb{C})$ be the group of $n \times n$ invertible matrices over $\mathbb{C}$, and let $B$ be the subgroup of $G$ consisting of the $n \times n$ upper-triangular matrices. The flagged Weyl module is a representation $\mathcal{M}_D$ of $B$ associated to a diagram $D$. The dual character of $\mathcal{M}_D$ has been shown in certain cases to be a Schubert polynomial [8] or a key polynomial [16]. We will use the construction of $\mathcal{M}_D$ in terms of determinants given in [13].
Let $Y$ denote the $n \times n$ matrix whose $(i, j)$ entry is an indeterminate $y_{ij}$ if $i \leq j$ and $0$ if $i > j$. For $R, S \subseteq [n]$ with $|R| = |S|$, let $Y^R_S$ denote the submatrix of $Y$ obtained by restricting to the rows $R$ and columns $S$, and write $R \leq S$ if the $k$th smallest element of $R$ is at most the $k$th smallest element of $S$ for each $k$; for diagrams $C$ and $D$, write $C \leq D$ if $C_j \leq D_j$ for all $j$. The flagged Weyl module $\mathcal{M}_D$ is the span of the products $\prod_{j} \det\big(Y^{C_j}_{D_j}\big)$ over all diagrams $C$ with $|C_j| = |D_j|$ for all $j$ and $C \leq D$. Note that since $Y$ is upper-triangular, the condition $C \leq D$ is technically unnecessary, since $\det Y^{C_j}_{D_j} = 0$ unless $C_j \leq D_j$. For any $B$-module $N$, the character of $N$ is defined by $\mathrm{char}(N)(x_1, \dots, x_n) = \mathrm{tr}(X \colon N \to N)$, where $X$ is the diagonal matrix $\mathrm{diag}(x_1, x_2, \dots, x_n)$ with diagonal entries $x_1, \dots, x_n$, and $X$ is viewed as a linear map from $N$ to $N$ via the $B$-action. Define the dual character of $N$ to be the character of the dual module $N^*$; write $\chi_D$ for the dual character of $\mathcal{M}_D$. Another special family of dual characters of flagged Weyl modules of diagrams, for so-called skyline diagrams of compositions, are the key polynomials [3]. We now work towards proving Theorem 5.1. We start by reviewing some material from [4] for the reader's convenience, and then derive several lemmas that simplify the proof of Theorem 5.1. The understanding of the coefficients of the monomials of $\chi_D$ given in Corollary 5.6 is key to our proof of Theorem 5.1. We set up some notation now.
Given diagrams $C, D \subseteq [n] \times [n]$ and $k, l \in [n]$, let $C'$ and $D'$ denote the diagrams obtained from $C$ and $D$ by removing any boxes in row $k$ or column $l$. Fix a diagram $D$. The following lemma is immediate and its proof is left to the reader.
If the polynomials $\prod_{j} \det Y^{C^{(i)}_j}_{D_j}$, $i \in [m]$, are linearly dependent, then so are the polynomials $\prod_{j} \det Y^{(C^{(i)})'_j}_{D'_j}$, $i \in [m]$.

Proof. We are given that a relation (4) among the former polynomials holds for some constants $(c_i)_{i \in [m]} \in \mathbb{C}^m$, not all zero. Combining (4) and (6), the polynomials $\prod_{j} \det Y^{(C^{(i)})'_j}_{D'_j}$ are linearly dependent, as desired. Now, suppose that there are boxes of $D$ in row $k$ that are not in $D_l$. Let $i_1 < \cdots < i_p$ be all indices $j \neq l$ such that $D_j = D'_j \cup \{k\}$. Consider the left-hand side of (4) as a polynomial in $y_{kk}$; then (4) yields the desired linear dependence.
We now prove Theorem 5.1 and Theorem 1.2; we restate the latter here for convenience.

Theorem 1.2. Fix $w \in S_n$ and let $\sigma \in S_{n-1}$ be the pattern with Rothe diagram $D(\sigma)$ obtained by removing row $k$ and column $w_k$ from $D(w)$. Then
$$\mathfrak{S}_w(x_1, \dots, x_n) = M(x_1, \dots, x_n)\,\mathfrak{S}_\sigma(x_1, \dots, \widehat{x_k}, \dots, x_n) + F(x_1, \dots, x_n), \qquad (11)$$
where $M$ is a monomial and $F \in \mathbb{Z}_{\geq 0}[x_1, \dots, x_n]$.

Proof. Specialize Theorem 5.1 to the case where $D$ is a Rothe diagram $D(w)$ and $l = w_k$. The dropping of $x_k$ is due to reindexing, since the entirety of row $k$ and column $w_k$ of $D(w)$ is removed to obtain $D(\sigma)$, not just the boxes in row $k$ and column $w_k$.

Corollary 5.9. Fix $w \in S_n$ and let $\sigma \in S_m$ be any pattern contained in $w$. If $k$ is a coefficient of a monomial in $\mathfrak{S}_\sigma$, then $\mathfrak{S}_w$ contains a monomial with coefficient at least $k$.