A URL-encoding pitfall encountered while using the Retrofit networking framework

Retrofit is a type-safe HTTP client for Android and Java. The networking module in our project is a wrapper built on Retrofit.

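During development we integrated a third-party SDK whose output had to be uploaded to our server. The '+' characters in the SDK output, however, arrived as spaces: the data was URL-encoded for transmission, and after the server URL-decoded it, every '+' had become a space.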
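The symptom can be reproduced without any networking code. The sketch below uses the JDK's java.net.URLDecoder as a stand-in for the server's form decoder (an assumption; the real server implementation is not shown in this post), and the class and method names are made up for illustration. It requires Java 10 or later for the Charset overload.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class PlusDecodeDemo {
    /** Decodes a form-encoded value the way a typical server-side form parser would. */
    static String formDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A raw '+' is the form-encoding shorthand for a space.
        System.out.println("[" + formDecode("a+b") + "]");
        // A literal plus sign has to travel as %2B to survive decoding.
        System.out.println("[" + formDecode("a%2Bb") + "]");
    }
}
```

So a '+' that reaches the server unescaped is, by the rules of form encoding, indistinguishable from a space.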

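Running the SDK output through an online URL encoder and decoder showed no problem: the round trip always returned the original data. So I started looking into how URL encoding is implemented underneath Retrofit.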

Background

The network request, declared through our Retrofit wrapper, looked like this:

@FormUrlEncoded
@POST("user/upload")
Observable<Response<Boolean>> upload(@Field(value = "data", encoded = true) String data);

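I had assumed that encoded = true meant "URL-encode this value". Working by elimination, I first set encoded to false, expecting that to switch URL encoding off. Instead, with encoded = false the server returned success. Logging the request showed that the request body was still URL-encoded. Why?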

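I then logged the data sent with encoded = true and with encoded = false. The only difference was how '+' was encoded; everything else was identical. A web search turned up no explanation, so the only option left was to step through the Retrofit source.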

// Original data
/gAAAAAAAAD+AAAAAAAAAP4AAAAAAAAA

//encoded = true
%2FgAAAAAAAAD+AAAAAAAAAP4AAAAAAAAA

//encoded = false
%2FgAAAAAAAAD%2BAAAAAAAAAP4AAAAAAAAA
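The claim that the two logged bodies differ only in the plus sign is easy to check mechanically. A minimal sketch, using the two strings above verbatim; the class and method names are hypothetical:

```java
final class PayloadDiff {
    static final String ENCODED_TRUE = "%2FgAAAAAAAAD+AAAAAAAAAP4AAAAAAAAA";
    static final String ENCODED_FALSE = "%2FgAAAAAAAAD%2BAAAAAAAAAP4AAAAAAAAA";

    /** True when the two bodies are identical once every "%2B" is read as '+'. */
    static boolean sameAfterNormalizingPlus(String a, String b) {
        return a.replace("%2B", "+").equals(b.replace("%2B", "+"));
    }

    public static void main(String[] args) {
        System.out.println("identical as sent: " + ENCODED_TRUE.equals(ENCODED_FALSE));
        System.out.println("identical apart from '+': "
                + sameAfterNormalizingPlus(ENCODED_TRUE, ENCODED_FALSE));
    }
}
```

Note that '/' is escaped as %2F in both bodies, so some encoding step clearly runs even when encoded = true.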

Tracing the cause through the source code

In Retrofit, the @FormUrlEncoded annotation makes the request use the application/x-www-form-urlencoded content type and URL-encodes the data of parameters annotated with @Field:

/**
 * Denotes that the request body will use form URL encoding. Fields should be declared as
 * parameters and annotated with {@link Field @Field}.
 * <p>
 * Requests made with this annotation will have {@code application/x-www-form-urlencoded} MIME
 * type. Field names and values will be UTF-8 encoded before being URI-encoded in accordance to
 * <a href="http://tools.ietf.org/html/rfc3986">RFC-3986</a>.
 */
@Documented
@Target(METHOD)
@Retention(RUNTIME)
public @interface FormUrlEncoded {
}

No surprises here: using this annotation already URL-encodes the form data of @Field parameters. So why does @Field have an encoded parameter at all?

@Documented
@Target(PARAMETER)
@Retention(RUNTIME)
public @interface Field {
  String value();

  /** Specifies whether the {@linkplain #value() name} and value are already URL encoded. */
  boolean encoded() default false;
}

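According to the documentation on Field, encoded states whether the name and value are already URL-encoded. That is easy to understand: encoded = true means the value has already been URL-encoded. But then why was everything in the logged data except the plus sign still URL-encoded?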

The only way forward was to keep reading.

How RequestFactory parses the annotations

// Parsing of the FormUrlEncoded annotation; used with HTTP methods that have a body (POST, PUT, PATCH; hasBody = true)
if (annotation instanceof FormUrlEncoded) {
    if (isMultipart) {
       throw methodError(method, "Only one encoding annotation is allowed.");
    }
    isFormEncoded = true;
}

// Parsing of the Field annotation (excerpt; only the Iterable branch is shown)
if (annotation instanceof Field) {
   validateResolvableType(p, type);
   if (!isFormEncoded) {
      throw parameterError(method, p, "@Field parameters can only be used with form encoding.");
   }
   Field field = (Field) annotation;
   String name = field.value();
   boolean encoded = field.encoded();

   gotField = true;

   Class<?> rawParameterType = Utils.getRawType(type);
   if (Iterable.class.isAssignableFrom(rawParameterType)) {
      if (!(type instanceof ParameterizedType)) {
         throw parameterError(method, p, rawParameterType.getSimpleName()
                + " must include generic type (e.g., "
                + rawParameterType.getSimpleName()
                + "<String>)");
      }
      ParameterizedType parameterizedType = (ParameterizedType) type;
      Type iterableType = Utils.getParameterUpperBound(0, parameterizedType);
      Converter<?, String> converter =
              retrofit.stringConverter(iterableType, annotations);
      // The encoded flag is handed on to the parameter handler.
      return new ParameterHandler.Field<>(name, converter, encoded).iterable();
   }
   // ... handling of other parameter types is omitted from this excerpt
}

// A method annotated with FormUrlEncoded but without any @Field or @FieldMap parameter is rejected
if (isFormEncoded && !gotField) {
   throw methodError(method, "Form-encoded method must contain at least one @Field.");
}

RequestFactory creates a RequestBuilder, so next we look at how RequestBuilder handles URL encoding.

In the constructor, if isFormEncoded is true, an okhttp3.FormBody.Builder instance is created:
if (isFormEncoded) {
   // Will be set to 'body' in 'build'.
   formBuilder = new FormBody.Builder();
} else if (isMultipart) {
   // Will be set to 'body' in 'build'.
   multipartBuilder = new MultipartBody.Builder();
   multipartBuilder.setType(MultipartBody.FORM);
}

// If encoded is true, the call goes to addEncoded on the formBuilder instance; otherwise to add
void addFormField(String name, String value, boolean encoded) {
    if (encoded) {
      formBuilder.addEncoded(name, value);
    } else {
      formBuilder.add(name, value);
    }
}

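From here on we are in the OkHttp source; the actual URL encoding is implemented by OkHttp.

The corresponding methods in FormBody.Builder

// encoded = false ends up in this method for URL encoding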
fun add(name: String, value: String) = apply {
  names += name.canonicalize(
      encodeSet = FORM_ENCODE_SET,
      plusIsSpace = true,
      charset = charset
  )
  values += value.canonicalize(
      encodeSet = FORM_ENCODE_SET,
      plusIsSpace = true,
      charset = charset
  )
}

// encoded = true ends up in this method for URL encoding
fun addEncoded(name: String, value: String) = apply {
   names += name.canonicalize(
      encodeSet = FORM_ENCODE_SET,
      alreadyEncoded = true,
      plusIsSpace = true,
      charset = charset
   )
   values += value.canonicalize(
      encodeSet = FORM_ENCODE_SET,
      alreadyEncoded = true,
      plusIsSpace = true,
      charset = charset
   )
}

// In the HttpUrl class
/**
     * Returns a substring of `input` on the range `[pos..limit)` with the following
     * transformations:
     *
     *  * Tabs, newlines, form feeds and carriage returns are skipped.
     *
     *  * In queries, ' ' is encoded to '+' and '+' is encoded to "%2B".
     *
     *  * Characters in `encodeSet` are percent-encoded.
     *
     *  * Control characters and non-ASCII characters are percent-encoded.
     *
     *  * All other characters are copied without transformation.
     *
     * @param alreadyEncoded true to leave '%' as-is; false to convert it to '%25'.
     * @param strict true to encode '%' if it is not the prefix of a valid percent encoding.
     * @param plusIsSpace true to encode '+' as "%2B" if it is not already encoded.
     * @param unicodeAllowed true to leave non-ASCII codepoint unencoded.
     * @param charset which charset to use, null equals UTF-8.
     */
    internal fun String.canonicalize(
      pos: Int = 0,
      limit: Int = length,
      encodeSet: String,
      alreadyEncoded: Boolean = false,
      strict: Boolean = false,
      plusIsSpace: Boolean = false,
      unicodeAllowed: Boolean = false,
      charset: Charset? = null
    ): String {
      var codePoint: Int
      var i = pos
      while (i < limit) {
        codePoint = codePointAt(i)
        if (codePoint < 0x20 ||
            codePoint == 0x7f ||
            codePoint >= 0x80 && !unicodeAllowed ||
            codePoint.toChar() in encodeSet ||
            codePoint == '%'.toInt() &&
            (!alreadyEncoded || strict && !isPercentEncoded(i, limit)) ||
            codePoint == '+'.toInt() && plusIsSpace) {
          // Slow path: the character at i requires encoding!
          val out = Buffer()
          out.writeUtf8(this, pos, i)
          out.writeCanonicalized(
              input = this,
              pos = i,
              limit = limit,
              encodeSet = encodeSet,
              alreadyEncoded = alreadyEncoded,
              strict = strict,
              plusIsSpace = plusIsSpace,
              unicodeAllowed = unicodeAllowed,
              charset = charset
          )
          return out.readUtf8()
        }
        i += Character.charCount(codePoint)
      }

      // Fast path: no characters in [pos..limit) required encoding.
      return substring(pos, limit)
    }

    private fun Buffer.writeCanonicalized(
      input: String,
      pos: Int,
      limit: Int,
      encodeSet: String,
      alreadyEncoded: Boolean,
      strict: Boolean,
      plusIsSpace: Boolean,
      unicodeAllowed: Boolean,
      charset: Charset?
    ) {
      var encodedCharBuffer: Buffer? = null // Lazily allocated.
      var codePoint: Int
      var i = pos
      while (i < limit) {
        codePoint = input.codePointAt(i)
        if (alreadyEncoded && (codePoint == '\t'.toInt() || codePoint == '\n'.toInt() ||
                codePoint == '\u000c'.toInt() || codePoint == '\r'.toInt())) {
          // Skip this character.
        } else if (codePoint == '+'.toInt() && plusIsSpace) {
          // Encode '+' as '%2B' since we permit ' ' to be encoded as either '+' or '%20'.
          writeUtf8(if (alreadyEncoded) "+" else "%2B")
        } else if (codePoint < 0x20 ||
            codePoint == 0x7f ||
            codePoint >= 0x80 && !unicodeAllowed ||
            codePoint.toChar() in encodeSet ||
            codePoint == '%'.toInt() &&
            (!alreadyEncoded || strict && !input.isPercentEncoded(i, limit))) {
          // Percent encode this character.
          if (encodedCharBuffer == null) {
            encodedCharBuffer = Buffer()
          }

          if (charset == null || charset == UTF_8) {
            encodedCharBuffer.writeUtf8CodePoint(codePoint)
          } else {
            encodedCharBuffer.writeString(input, i, i + Character.charCount(codePoint), charset)
          }

          while (!encodedCharBuffer.exhausted()) {
            val b = encodedCharBuffer.readByte().toInt() and 0xff
            writeByte('%'.toInt())
            writeByte(HEX_DIGITS[b shr 4 and 0xf].toInt())
            writeByte(HEX_DIGITS[b and 0xf].toInt())
          }
        } else {
          // This character doesn't need encoding. Just copy it over.
          writeUtf8CodePoint(codePoint)
        }
        i += Character.charCount(codePoint)
      }
    }
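To see the effect of alreadyEncoded in isolation, the two OkHttp functions above can be condensed into a small stand-alone model. This is a simplified sketch, not OkHttp's code: it fixes plusIsSpace to true and the charset to UTF-8, leaves out the strict flag and the skipping of tabs and newlines, and the ENCODE_SET constant is an illustrative assumption rather than a value read from the library.

```java
import java.nio.charset.StandardCharsets;

final class FormCanonicalizeSketch {
    // Illustrative set of characters that always get percent-encoded (assumption).
    private static final String ENCODE_SET = " \"':;<=>@[]^`{}|/\\?#&!$(),~";

    /** Simplified model of canonicalize(...) as called by add (false) and addEncoded (true). */
    static String canonicalize(String input, boolean alreadyEncoded) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (cp == '+') {
                // The decisive branch: '+' is left untouched when alreadyEncoded is true.
                out.append(alreadyEncoded ? "+" : "%2B");
            } else if (cp < 0x20 || cp >= 0x7f
                    || ENCODE_SET.indexOf(cp) != -1
                    || (cp == '%' && !alreadyEncoded)) {
                // Percent-encode every UTF-8 byte of this code point.
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%').append(String.format("%02X", b & 0xff));
                }
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "/gAAAAAAAAD+AAAAAAAAAP4AAAAAAAAA";
        System.out.println(canonicalize(raw, true));
        System.out.println(canonicalize(raw, false));
    }
}
```

Fed the original data, the model yields the same two bodies that were logged earlier: '/' is escaped in both modes, while '+' is escaped only when alreadyEncoded is false.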

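Conclusion

The code above shows that when alreadyEncoded is true, '+' is not encoded. That is the root cause: because encoded was set to true, the '+' characters in the client's data were sent unescaped, and URL decoding on the server turned each '+' into a space, which is why the server failed to parse the data.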

if (codePoint == '+'.toInt() && plusIsSpace) {
   // Encode '+' as '%2B' since we permit ' ' to be encoded as either '+' or '%20'.
   writeUtf8(if (alreadyEncoded) "+" else "%2B")
}

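encoded defaults to false, in which case OkHttp performs the URL encoding. Set encoded to true only if you URL-encode the data yourself; otherwise leave the parameter out, as follows: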

@FormUrlEncoded
@POST("user/upload")
Observable<Response<Boolean>> upload(@Field(value = "data") String data);
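If encoded = true is genuinely wanted, the value has to be fully URL-encoded before it is handed to Retrofit. A minimal sketch of that pre-encoding step, assuming the JDK's java.net.URLEncoder is acceptable as the encoder (Java 10 or later for the Charset overload); the class and method names are hypothetical, and the Retrofit call itself is left out so the example runs on its own:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PreEncodeDemo {
    /** Encodes a raw value so that it is safe to pass to a @Field(encoded = true) parameter. */
    static String preEncode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "/gAAAAAAAAD+AAAAAAAAAP4AAAAAAAAA";
        String encoded = preEncode(raw);
        System.out.println(encoded);
        // What a form-decoding server would recover from the pre-encoded value.
        System.out.println(URLDecoder.decode(encoded, StandardCharsets.UTF_8));
    }
}
```

URLEncoder writes a literal plus sign as %2B and a space as '+', so the only '+' characters left in the result are ones that really stand for spaces.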

Source repositories for Retrofit and OkHttp

https://github.com/square/okhttp

https://github.com/square/retrofit
