diff --git a/docs/content.zh/docs/connectors/table/formats/debezium.md b/docs/content.zh/docs/connectors/table/formats/debezium.md index 91a9a88d9f98a..a0b2f8d7b7a29 100644 --- a/docs/content.zh/docs/connectors/table/formats/debezium.md +++ b/docs/content.zh/docs/connectors/table/formats/debezium.md @@ -52,7 +52,7 @@ Flink 还支持将 Flink SQL 中的 INSERT / UPDATE / DELETE 消息编码为 Deb {{< sql_download_table "debezium-json" >}} -*注意: 请参考 [Debezium 文档](https://debezium.io/documentation/reference/1.3/index.html),了解如何设置 Debezium Kafka Connect 用来将变更日志同步到 Kafka 主题。* +*注意: 请参考 [Debezium 文档](https://debezium.io/documentation/reference/stable/index.html),了解如何设置 Debezium Kafka Connect 用来将变更日志同步到 Kafka 主题。* 如何使用 Debezium Format @@ -82,7 +82,7 @@ Debezium 为变更日志提供了统一的格式,这是一个 JSON 格式的 } ``` -*注意: 请参考 [Debezium 文档](https://debezium.io/documentation/reference/1.3/connectors/mysql.html#mysql-connector-events_debezium),了解每个字段的含义。* +*注意: 请参考 [Debezium 文档](https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-events),了解每个字段的含义。* MySQL 产品表有4列(`id`、`name`、`description`、`weight`)。上面的 JSON 消息是 `products` 表上的一条更新事件,其中 `id = 111` 的行的 `weight` 值从 `5.18` 更改为 `5.15`。假设此消息已同步到 Kafka 主题 `products_binlog`,则可以使用以下 DDL(用于 Debezium JSON 和 Debezium Confluent Avro)来消费此主题并解析更改事件。 @@ -476,12 +476,12 @@ Flink 提供了 `debezium-avro-confluent` 和 `debezium-json` 两种 format 来 ### 消费 Debezium Postgres Connector 产生的数据 -如果你正在使用 [Debezium PostgreSQL Connector](https://debezium.io/documentation/reference/1.2/connectors/postgresql.html) 捕获变更到 Kafka,请确保被监控表的 [REPLICA IDENTITY](https://www.postgresql.org/docs/current/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY) 已经被配置成 `FULL` 了,默认值是 `DEFAULT`。 +如果你正在使用 [Debezium PostgreSQL Connector](https://debezium.io/documentation/reference/stable/connectors/postgresql.html) 捕获变更到 Kafka,请确保被监控表的 [REPLICA IDENTITY](https://www.postgresql.org/docs/current/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY) 已经被配置成 `FULL` 了,默认值是 `DEFAULT`。 否则,Flink SQL 将无法正确解析 Debezium 数据。 当配置为 `FULL` 时,更新和删除事件将完整包含所有列的之前的值。当为其他配置时,更新和删除事件的 "before" 字段将只包含 primary key 字段的值,或者为 null(没有 primary key)。 你可以通过运行 `ALTER TABLE REPLICA IDENTITY FULL` 来更改 `REPLICA IDENTITY` 的配置。 -请阅读 [Debezium 关于 PostgreSQL REPLICA IDENTITY 的文档](https://debezium.io/documentation/reference/1.2/connectors/postgresql.html#postgresql-replica-identity) 了解更多。 +请阅读 [Debezium 关于 PostgreSQL REPLICA IDENTITY 的文档](https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-replica-identity) 了解更多。 数据类型映射 ---------------- diff --git a/docs/content.zh/docs/connectors/table/formats/ogg.md b/docs/content.zh/docs/connectors/table/formats/ogg.md index 61ec97b60fdfb..7ad40803375ce 100644 --- a/docs/content.zh/docs/connectors/table/formats/ogg.md +++ b/docs/content.zh/docs/connectors/table/formats/ogg.md @@ -84,7 +84,7 @@ Ogg 为变更日志提供了统一的格式, 这是一个 JSON 格式的从 Orac } ``` -*注意:请参考 [Debezium documentation](https://debezium.io/documentation/reference/1.3/connectors/oracle.html#oracle-events) +*注意:请参考 [Debezium documentation](https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-events) 了解每个字段的含义.* Oracle `PRODUCTS` 表 有 4 列 (`id`, `name`, `description` and `weight`). 上面的 JSON 消息是 `PRODUCTS` 表上的一条更新事件,其中 `id = 111` 的行的 diff --git a/docs/content.zh/docs/deployment/adaptive_batch.md b/docs/content.zh/docs/deployment/adaptive_batch.md index 070cad353473c..323f560846d44 100644 --- a/docs/content.zh/docs/deployment/adaptive_batch.md +++ b/docs/content.zh/docs/deployment/adaptive_batch.md @@ -167,6 +167,6 @@ Adaptive Batch Scheduler 默认启用 Skewed Join 优化,你可以通过配 - **只支持 AdaptiveBatchScheduler**: 不过由于 Adaptive Batch Scheduler 是 Flink 默认的批作业调度器,无需额外配置。除非用户显式的配置了使用其他调度器,例如 [`jobmanager.scheduler: default`]。 - **只支持所有数据交换都为 BLOCKING 或 HYBRID 模式的作业**: 目前 Adaptive Batch Scheduler 只支持 [shuffle mode]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) 为 ALL_EXCHANGES_BLOCKING 或 ALL_EXCHANGES_HYBRID_FULL 或 ALL_EXCHANGES_HYBRID_SELECTIVE 的作业。请注意,使用 DataSet API 的作业无法识别上述 shuffle 模式,需要将 ExecutionMode 设置为 BATCH_FORCED 才能强制启用 BLOCKING shuffle。 - **不支持 FileInputFormat 类型的 source**: 不支持 FileInputFormat 类型的 source, 包括 `StreamExecutionEnvironment#readFile(...)` 和 `StreamExecutionEnvironment#createInput(FileInputFormat, ...)`。 当使用 Adaptive Batch Scheduler 时,用户应该使用新版的 Source API ([FileSystem DataStream Connector]({{< ref "docs/connectors/datastream/filesystem.md" >}}) 或 [FileSystem SQL Connector]({{< ref "docs/connectors/table/filesystem.md" >}})) 来读取文件. -- **Web UI 上展示的上游输出的数据量和下游收到的数据量可能不一致**: 在使用 Adaptive Batch Scheduler 自动推导并行度时,对于 broadcast 边,上游发送的数据量是基于下游最大并行度估算的结果,与下游算子实际接收的数据量可能会不相等,这在 Web UI 的显示上可能会困扰用户。细节详见 [FLIP-187](https://cwiki.apache.org/confluence/display/FLINK/FLIP-187%3A+Adaptive+Batch+Job+Scheduler)。 +- **Web UI 上展示的上游输出的数据量和下游收到的数据量可能不一致**: 在使用 Adaptive Batch Scheduler 自动推导并行度时,对于 broadcast 边,上游发送的数据量是基于下游最大并行度估算的结果,与下游算子实际接收的数据量可能会不相等,这在 Web UI 的显示上可能会困扰用户。细节详见 [FLIP-187](https://cwiki.apache.org/confluence/display/FLINK/FLIP-187%3A+Adaptive+Batch+Scheduler)。 {{< top >}} diff --git a/docs/content.zh/docs/deployment/ha/zookeeper_ha.md b/docs/content.zh/docs/deployment/ha/zookeeper_ha.md index 57f0c8bbe8ebc..d1f10b417136f 100644 --- a/docs/content.zh/docs/deployment/ha/zookeeper_ha.md +++ b/docs/content.zh/docs/deployment/ha/zookeeper_ha.md @@ -122,7 +122,7 @@ This behaviour might be too disruptive in some cases (e.g., unstable network env If you are willing to take a more aggressive approach, then you can tolerate suspended ZooKeeper connections and only treat lost connections as an error via [high-availability.zookeeper.client.tolerate-suspended-connections]({{< ref "docs/deployment/config" >}}#high-availability-zookeeper-client-tolerate-suspended-connection). Enabling this feature will make Flink more resilient against temporary connection problems but also increase the risk of running into ZooKeeper timing problems. -For more information take a look at [Curator's error handling](https://curator.apache.org/errors.html). +For more information take a look at [Curator's error handling](https://curator.apache.org/docs/errors). diff --git a/docs/content.zh/docs/deployment/speculative_execution.md b/docs/content.zh/docs/deployment/speculative_execution.md index f3678ec55a69f..3ab043dd229db 100644 --- a/docs/content.zh/docs/deployment/speculative_execution.md +++ b/docs/content.zh/docs/deployment/speculative_execution.md @@ -78,7 +78,7 @@ public interface SupportsHandleExecutionAttemptSourceEvent { 这意味着 SplitEnumerator 需要知道是哪个执行实例发出了这个事件。否则,JobManager 会在收到 SourceEvent 的时候报错从而导致作业失败。 除此之外的 Source 不需要额外的改动就可以进行预测执行,包括 -{{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java" name="SourceFunction Source" >}}, +{{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/functions/source/legacy/SourceFunction.java" name="SourceFunction Source" >}}, {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java" name="InputFormat Source" >}}, 和 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java" name="新版 Source" >}}. Apache Flink 官方提供的 Source 都支持预测执行。 @@ -91,7 +91,7 @@ public interface SupportsConcurrentExecutionAttempts {} ``` 接口 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/SupportsConcurrentExecutionAttempts.java" name="SupportsConcurrentExecutionAttempts" >}} 适用于 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Sink.java" name="Sink" >}} -,{{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/SinkFunction.java" name="SinkFunction" >}} +,{{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/functions/sink/legacy/SinkFunction.java" name="SinkFunction" >}} 以及 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/io/OutputFormat.java" name="OutputFormat" >}}。 {{< hint info >}} @@ -101,7 +101,7 @@ public interface SupportsConcurrentExecutionAttempts {} {{< hint info >}} 对于 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Sink.java" name="Sink" >}} 实现, Flink 会关闭 {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Committer.java" name="Committer" >}} 的预测执行, -(包括被 {{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/WithPreCommitTopology.java" name="WithPreCommitTopology" >}} 和 {{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/WithPostCommitTopology.java" name="WithPostCommitTopology" >}} 扩展的算子)。 +(包括被 {{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/connector/sink2/SupportsPreCommitTopology.java" name="SupportsPreCommitTopology" >}} 和 {{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/connector/sink2/SupportsPostCommitTopology.java" name="SupportsPostCommitTopology" >}} 扩展的算子)。 因为如果用户对并行提交理解不深的话,这里可能会引起意料之外的问题。另外一个原因是提交的部分往往不是批作业的瓶颈所在。 {{< /hint >}} diff --git a/docs/content.zh/docs/dev/configuration/overview.md b/docs/content.zh/docs/dev/configuration/overview.md index 0956ee3267713..2c31bd2ca7554 100644 --- a/docs/content.zh/docs/dev/configuration/overview.md +++ b/docs/content.zh/docs/dev/configuration/overview.md @@ -123,7 +123,7 @@ repositories { } // NOTE: We cannot use "compileOnly" or "shadow" configurations since then we could not run code // in the IDE or with "gradle run". We also cannot exclude transitive dependencies from the -// shadowJar yet (see https://github.com/johnrengelman/shadow/issues/159). +// shadowJar yet (see https://github.com/GradleUp/shadow/issues/159). // -> Explicitly define the // libraries we want to be included in the "flinkShadowJar" configuration! configurations { flinkShadowJar // dependencies which go into the shadowJar diff --git a/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md b/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md index 8df6ddeb31647..629b4d09c6cad 100644 --- a/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md +++ b/docs/content.zh/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md @@ -93,7 +93,7 @@ Flink 基于下面的规则来支持 [POJO 类型]({{< ref "docs/dev/datastream/ ### Avro 类型 Flink 完全支持 Avro 状态类型的升级,只要数据结构的修改是被 -[Avro 的数据结构解析规则](http://avro.apache.org/docs/current/spec.html#Schema+Resolution)认为兼容的即可。 +[Avro 的数据结构解析规则](https://avro.apache.org/docs/current/specification/)认为兼容的即可。 一个例外是如果新的 Avro 数据 schema 生成的类无法被重定位或者使用了不同的命名空间,在作业恢复时状态数据会被认为是不兼容的。 diff --git a/docs/content.zh/docs/dev/datastream/operators/windows.md b/docs/content.zh/docs/dev/datastream/operators/windows.md index 0c113811bce7d..8524ad2cd291f 100644 --- a/docs/content.zh/docs/dev/datastream/operators/windows.md +++ b/docs/content.zh/docs/dev/datastream/operators/windows.md @@ -1107,7 +1107,7 @@ Flink 包含一些内置 trigger。 * `PurgingTrigger` 接收另一个 trigger 并将它转换成一个会清理数据的 trigger。 如果你需要实现自定义的 trigger,你应该看看这个抽象类 -{{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java" name="Trigger" >}} 。 +{{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java" name="Trigger" >}} 。 请注意,这个 API 仍在发展,所以在之后的 Flink 版本中可能会发生变化。 ## Evictors diff --git a/docs/content.zh/docs/dev/table/table_environment.md b/docs/content.zh/docs/dev/table/table_environment.md index 80d4108957b2d..d48b74851c06a 100644 --- a/docs/content.zh/docs/dev/table/table_environment.md +++ b/docs/content.zh/docs/dev/table/table_environment.md @@ -138,7 +138,7 @@ These APIs are used to create/remove Table API/SQL Tables and write queries: | `executeSql(stmt)` | `execute_sql(stmt)` | Executes the given SQL statement and returns the execution result. Supports DDL/DML/DQL/SHOW/DESCRIBE/EXPLAIN/USE statements. | | `sqlQuery(query)` | `sql_query(query)` | Evaluates a SQL query and retrieves the result as a Table. | -See the [Java API docs](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/table/api/TableEnvironment.html) and [Python API docs](https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/api/pyflink.table.TableEnvironment.html) for complete API reference. +See the [Java API docs](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/table/api/TableEnvironment.html) and [Python API docs](https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/table_environment.html) for complete API reference. Deprecated APIs diff --git a/docs/content.zh/docs/sql/hive-compatibility/hiveserver2.md b/docs/content.zh/docs/sql/hive-compatibility/hiveserver2.md index 2d0539857e1d0..8ce074c2b0fa6 100644 --- a/docs/content.zh/docs/sql/hive-compatibility/hiveserver2.md +++ b/docs/content.zh/docs/sql/hive-compatibility/hiveserver2.md @@ -279,7 +279,7 @@ After the setup, you can explore Flink with DBeaver. ### Apache Superset Apache Superset is a powerful data exploration and visualization platform. With the API compatibility, you can connect -to the Flink SQL Gateway like Hive. Please refer to the [guidance](https://superset.apache.org/docs/databases/hive) for more details. +to the Flink SQL Gateway like Hive. Please refer to the [guidance](https://superset.apache.org/user-docs/databases/supported/apache-hive/) for more details. {{< img width="80%" src="/fig/apache_superset.png" alt="Apache Superset" >}} diff --git a/docs/content/docs/connectors/table/formats/debezium.md b/docs/content/docs/connectors/table/formats/debezium.md index 9e2f60f23a6ab..54f36448887e2 100644 --- a/docs/content/docs/connectors/table/formats/debezium.md +++ b/docs/content/docs/connectors/table/formats/debezium.md @@ -53,7 +53,7 @@ Dependencies {{< sql_download_table "debezium-json" >}} -*Note: please refer to [Debezium documentation](https://debezium.io/documentation/reference/1.3/index.html) about how to setup a Debezium Kafka Connect to synchronize changelog to Kafka topics.* +*Note: please refer to [Debezium documentation](https://debezium.io/documentation/reference/stable/index.html) about how to setup a Debezium Kafka Connect to synchronize changelog to Kafka topics.* How to use Debezium format @@ -82,7 +82,7 @@ Debezium provides a unified format for changelog, here is a simple example for a } ``` -*Note: please refer to [Debezium documentation](https://debezium.io/documentation/reference/1.3/connectors/mysql.html#mysql-connector-events_debezium) about the meaning of each fields.* +*Note: please refer to [Debezium documentation](https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-events) about the meaning of each fields.* The MySQL `products` table has 4 columns (`id`, `name`, `description` and `weight`). The above JSON message is an update change event on the `products` table where the `weight` value of the row with `id = 111` is changed from `5.18` to `5.15`. Assuming this messages is synchronized to Kafka topic `products_binlog`, then we can use the following DDLs (for Debezium JSON and Debezium Confluent Avro) to consume this topic and interpret the change events. @@ -471,12 +471,12 @@ Framework will generate an additional stateful operator, and use the primary key ### Consuming data produced by Debezium Postgres Connector -If you are using [Debezium Connector for PostgreSQL](https://debezium.io/documentation/reference/1.2/connectors/postgresql.html) to capture the changes to Kafka, please make sure the [REPLICA IDENTITY](https://www.postgresql.org/docs/current/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY) configuration of the monitored PostgreSQL table has been set to `FULL` which is by default `DEFAULT`. +If you are using [Debezium Connector for PostgreSQL](https://debezium.io/documentation/reference/stable/connectors/postgresql.html) to capture the changes to Kafka, please make sure the [REPLICA IDENTITY](https://www.postgresql.org/docs/current/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY) configuration of the monitored PostgreSQL table has been set to `FULL` which is by default `DEFAULT`. Otherwise, Flink SQL currently will fail to interpret the Debezium data. In `FULL` strategy, the UPDATE and DELETE events will contain the previous values of all the table’s columns. In other strategies, the "before" field of UPDATE and DELETE events will only contain primary key columns or null if no primary key. You can change the `REPLICA IDENTITY` by running `ALTER TABLE REPLICA IDENTITY FULL`. -See more details in [Debezium Documentation for PostgreSQL REPLICA IDENTITY](https://debezium.io/documentation/reference/1.2/connectors/postgresql.html#postgresql-replica-identity). +See more details in [Debezium Documentation for PostgreSQL REPLICA IDENTITY](https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-replica-identity). Data Type Mapping ---------------- diff --git a/docs/content/docs/connectors/table/formats/ogg.md b/docs/content/docs/connectors/table/formats/ogg.md index 3b53916e36d95..fdc84ad3c63b8 100644 --- a/docs/content/docs/connectors/table/formats/ogg.md +++ b/docs/content/docs/connectors/table/formats/ogg.md @@ -92,7 +92,7 @@ captured from an Oracle `PRODUCTS` table in JSON format: ``` *Note: please refer -to [Debezium documentation](https://debezium.io/documentation/reference/1.3/connectors/oracle.html#oracle-events) +to [Debezium documentation](https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-events) about the meaning of each field.* The Oracle `PRODUCTS` table has 4 columns (`id`, `name`, `description` and `weight`). The above JSON diff --git a/docs/content/docs/deployment/adaptive_batch.md b/docs/content/docs/deployment/adaptive_batch.md index 75f1e08838d6d..54f3f065df6e4 100644 --- a/docs/content/docs/deployment/adaptive_batch.md +++ b/docs/content/docs/deployment/adaptive_batch.md @@ -171,6 +171,6 @@ Additionally, you can adjust the following configurations based on the character - **AdaptiveBatchScheduler only**: It only takes effect when using the AdaptiveBatchScheduler. And since the Adaptive Batch Scheduler is the default batch job scheduler in Flink, no additional configuration is required unless the user explicitly configures to use another scheduler, such as [`jobmanager.scheduler: default`]. - **BLOCKING or HYBRID jobs only**: At the moment, Adaptive Batch Scheduler only supports jobs whose [shuffle mode]({{< ref "docs/deployment/config" >}}#execution-batch-shuffle-mode) is `ALL_EXCHANGES_BLOCKING / ALL_EXCHANGES_HYBRID_FULL / ALL_EXCHANGES_HYBRID_SELECTIVE`. - **FileInputFormat sources are not supported**: FileInputFormat sources are not supported, including `StreamExecutionEnvironment#readFile(...)` and `StreamExecutionEnvironment#createInput(FileInputFormat, ...)`. Users should use the new sources([FileSystem DataStream Connector]({{< ref "docs/connectors/datastream/filesystem.md" >}}) or [FileSystem SQL Connector]({{< ref "docs/connectors/table/filesystem.md" >}})) to read files when using the Adaptive Batch Scheduler. -- **Inconsistent broadcast results metrics on WebUI**: When use Adaptive Batch Scheduler to automatically decide parallelisms for operators, for broadcast results, the number of bytes/records sent by the upstream task counted by metric is not equal to the number of bytes/records received by the downstream task, which may confuse users when displayed on the Web UI. See [FLIP-187](https://cwiki.apache.org/confluence/display/FLINK/FLIP-187%3A+Adaptive+Batch+Job+Scheduler) for details. +- **Inconsistent broadcast results metrics on WebUI**: When use Adaptive Batch Scheduler to automatically decide parallelisms for operators, for broadcast results, the number of bytes/records sent by the upstream task counted by metric is not equal to the number of bytes/records received by the downstream task, which may confuse users when displayed on the Web UI. See [FLIP-187](https://cwiki.apache.org/confluence/display/FLINK/FLIP-187%3A+Adaptive+Batch+Scheduler) for details. {{< top >}} diff --git a/docs/content/docs/deployment/ha/zookeeper_ha.md b/docs/content/docs/deployment/ha/zookeeper_ha.md index 07de0edc62996..e65bd0e03ffd3 100644 --- a/docs/content/docs/deployment/ha/zookeeper_ha.md +++ b/docs/content/docs/deployment/ha/zookeeper_ha.md @@ -129,7 +129,7 @@ This behaviour might be too disruptive in some cases (e.g., unstable network env If you are willing to take a more aggressive approach, then you can tolerate suspended ZooKeeper connections and only treat lost connections as an error via [high-availability.zookeeper.client.tolerate-suspended-connections]({{< ref "docs/deployment/config" >}}#high-availability-zookeeper-client-tolerate-suspended-connection). Enabling this feature will make Flink more resilient against temporary connection problems but also increase the risk of running into ZooKeeper timing problems. -For more information take a look at [Curator's error handling](https://curator.apache.org/errors.html). +For more information take a look at [Curator's error handling](https://curator.apache.org/docs/errors). ## Bootstrap ZooKeeper diff --git a/docs/content/docs/deployment/speculative_execution.md b/docs/content/docs/deployment/speculative_execution.md index 883ee477b1169..e8c43acb2a22e 100644 --- a/docs/content/docs/deployment/speculative_execution.md +++ b/docs/content/docs/deployment/speculative_execution.md @@ -92,7 +92,7 @@ This means the SplitEnumerator should be aware of the attempt which sends the ev will happen when the job manager receives a source event from the tasks and lead to job failures. No extra change is required for other sources to work with speculative execution, including -{{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java" name="SourceFunction sources" >}}, +{{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/functions/source/legacy/SourceFunction.java" name="SourceFunction sources" >}}, {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java" name="InputFormat sources" >}}, and {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java" name="new sources" >}}. All the source connectors offered by Apache Flink can work with speculative execution. @@ -105,7 +105,7 @@ public interface SupportsConcurrentExecutionAttempts {} ``` The {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/SupportsConcurrentExecutionAttempts.java" name="SupportsConcurrentExecutionAttempts" >}} works for {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Sink.java" name="Sink" >}} -, {{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/SinkFunction.java" name="SinkFunction" >}} +, {{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/functions/sink/legacy/SinkFunction.java" name="SinkFunction" >}} and {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/common/io/OutputFormat.java" name="OutputFormat" >}}. {{< hint info >}} @@ -116,7 +116,7 @@ That means if the Sink does not support speculative execution, the task containi {{< hint info >}} For the {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Sink.java" name="Sink" >}} implementation, Flink disables speculative execution for {{< gh_link file="/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Committer.java" name="Committer" >}}( -including the operators extended by {{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/WithPreCommitTopology.java" name="WithPreCommitTopology" >}} and {{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/WithPostCommitTopology.java" name="WithPostCommitTopology" >}}). +including the operators extended by {{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/connector/sink2/SupportsPreCommitTopology.java" name="SupportsPreCommitTopology" >}} and {{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/connector/sink2/SupportsPostCommitTopology.java" name="SupportsPostCommitTopology" >}}). Because the concurrent committing may cause some unexpected problems if the user is not experienced with it. And the committer is very unlikely to be the bottleneck of the batch job. {{< /hint >}} diff --git a/docs/content/docs/dev/configuration/overview.md b/docs/content/docs/dev/configuration/overview.md index b775b5755611b..4f044c0109759 100644 --- a/docs/content/docs/dev/configuration/overview.md +++ b/docs/content/docs/dev/configuration/overview.md @@ -125,7 +125,7 @@ repositories { } // NOTE: We cannot use "compileOnly" or "shadow" configurations since then we could not run code // in the IDE or with "gradle run". We also cannot exclude transitive dependencies from the -// shadowJar yet (see https://github.com/johnrengelman/shadow/issues/159). +// shadowJar yet (see https://github.com/GradleUp/shadow/issues/159). // -> Explicitly define the // libraries we want to be included in the "flinkShadowJar" configuration! configurations { flinkShadowJar // dependencies which go into the shadowJar diff --git a/docs/content/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md b/docs/content/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md index d5ea3c798ff09..4bb16308e0e27 100644 --- a/docs/content/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md +++ b/docs/content/docs/dev/datastream/fault-tolerance/serialization/schema_evolution.md @@ -103,7 +103,7 @@ newer than 1.8.0. When restoring with Flink versions older than 1.8.0, the schem ### Avro types Flink fully supports evolving schema of Avro type state, as long as the schema change is considered compatible by -[Avro's rules for schema resolution](http://avro.apache.org/docs/current/spec.html#Schema+Resolution). +[Avro's rules for schema resolution](https://avro.apache.org/docs/current/specification/). One limitation is that Avro generated classes used as the state type cannot be relocated or have different namespaces when the job is restored. diff --git a/docs/content/docs/dev/datastream/operators/windows.md b/docs/content/docs/dev/datastream/operators/windows.md index 6ef7c6492eca6..49e7ff8be00dc 100644 --- a/docs/content/docs/dev/datastream/operators/windows.md +++ b/docs/content/docs/dev/datastream/operators/windows.md @@ -1133,7 +1133,7 @@ Flink comes with a few built-in triggers. * The `PurgingTrigger` takes as argument another trigger and transforms it into a purging one. If you need to implement a custom trigger, you should check out the abstract -{{< gh_link file="/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java" name="Trigger" >}} class. +{{< gh_link file="/flink-runtime/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java" name="Trigger" >}} class. Please note that the API is still evolving and might change in future versions of Flink. ## Evictors diff --git a/docs/content/docs/dev/table/table_environment.md b/docs/content/docs/dev/table/table_environment.md index 2f3de03debaa6..e402e7392e211 100644 --- a/docs/content/docs/dev/table/table_environment.md +++ b/docs/content/docs/dev/table/table_environment.md @@ -138,7 +138,7 @@ These APIs are used to create/remove Table API/SQL Tables and write queries: | `executeSql(stmt)` | `execute_sql(stmt)` | Executes the given SQL statement and returns the execution result. Supports DDL/DML/DQL/SHOW/DESCRIBE/EXPLAIN/USE statements. | | `sqlQuery(query)` | `sql_query(query)` | Evaluates a SQL query and retrieves the result as a Table. | -See the [Java API docs](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/table/api/TableEnvironment.html) and [Python API docs](https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/api/pyflink.table.TableEnvironment.html) for complete API reference. +See the [Java API docs](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/table/api/TableEnvironment.html) and [Python API docs](https://nightlies.apache.org/flink/flink-docs-master/api/python/reference/pyflink.table/table_environment.html) for complete API reference. Deprecated APIs diff --git a/docs/content/docs/sql/interfaces/sql-gateway/hiveserver2.md b/docs/content/docs/sql/interfaces/sql-gateway/hiveserver2.md index 20dfc4b875d39..12a1706e3fd94 100644 --- a/docs/content/docs/sql/interfaces/sql-gateway/hiveserver2.md +++ b/docs/content/docs/sql/interfaces/sql-gateway/hiveserver2.md @@ -282,7 +282,7 @@ After the setup, you can explore Flink with DBeaver. ### Apache Superset Apache Superset is a powerful data exploration and visualization platform. With the API compatibility, you can connect -to the Flink SQL Gateway like Hive. Please refer to the [guidance](https://superset.apache.org/docs/databases/hive) for more details. +to the Flink SQL Gateway like Hive. Please refer to the [guidance](https://superset.apache.org/user-docs/databases/supported/apache-hive/) for more details. {{< img width="80%" src="/fig/apache_superset.png" alt="Apache Superset" >}}